Avian CHD genes and their use in methods for sex identification in birds

Introduction



The present invention relates to proteins, polypeptides, 
nucleic acid fragments, antibodies and related products and to their use in 
medicine and agriculture, for instance in diagnosis and therapy. More 
particularly the invention relates to a gene or genes which can be used to
ascertain the sex of avian adults, embryos, cells, and tissues. These
genes also control the sex of birds starting with action in the embryos and
so control the sex of the progeny of birds.

Much of our understanding of sex determination comes from
three extensively studied model systems. In two of these, the fruitfly
Drosophila melanogaster and the nematode Caenorhabditis elegans, it is
the ratio of X chromosomes to autosomes that initiates sexual
differentiation (Hodgkin 1992). In the mouse a single gene, SRY, located
on the Y chromosome provides the impetus for male development, a
pattern that is thought to be conserved throughout the mammals
(Koopman et al. 1991, Foster et al. 1992).

At the genetic level these three species employ very
different molecular mechanisms, not only to control sex determination itself
but to accommodate the differing dosages of genes that result from the
male possessing a single X and the female two X chromosomes. These
basic differences are largely due to the independent evolution of the three
mechanisms and strongly suggest that other means of sex determination
will have evolved elsewhere in the animal kingdom.

One class in which little is known about sex determination is
the birds. They exhibit female heterogamety, which means that the female
has Z and W sex chromosomes and the male ZZ. This immediately
suggests that sex determination in this class has an independent origin from
that of their sister class, the mammals, where it is the male that is
heterogametic. Furthermore, it has been shown that whilst female
mammals inactivate one of their X chromosomes as a method of dosage
compensation (Grant & Chapman 1988), this does not seem to be a device
employed by birds (Baverstock et al. 1982).

However, similarities do exist between the birds and
mammals. The W chromosome, like the Y chromosome, is usually smaller
than its partner, and is also characteristically heterochromatic in
appearance (Christidis 1990). The main exceptions to this rule are found
in the 'primitive' representatives of both classes, the monotremes and the
ratites, where the morphological differences between the sex chromosomes
are poorly defined (Graves 1987, Tagaki et al. 1972).

The heterochromatization of the W and Y results from the
replacement of functional genetic loci with 'junk DNA' sequences. This
process is thought to be a consequence of a suppression of recombination
that has arisen to ensure that genes vital to the development of the
heterogametic sex remain linked on the Y or W chromosome
(Charlesworth 1991). As a result only a few genes such as Ube1y (Kay et
al. 1991, Mitchell et al. 1991), Zfy (Page et al. 1987) and SRY itself remain
on the mammalian Y chromosome. A similar situation is thought to prevail
on the avian W chromosome where the presence of any functional genes
has yet to be demonstrated, although it does possess vast arrays of
repetitive elements (Griffiths & Holland 1990, Tone et al. 1982).

A further similarity in sex determination in birds and mammals
is that the development of the male phenotype appears crucially
dependent on the appearance of the testis. The female phenotype is the
result of the 'default pathway'. For mammals this was first demonstrated
by Jost (1947) who grafted an embryonic testis into genetically female
rabbit embryos prior to sex determination. This was sufficient to allow the
development of functional males. The same experiment has been carried
out on chick embryos with comparable results (Stoll et al. 1978).

Once the testis has formed, the process of masculinization is
taken over by the testicular hormones. The genetic switch that initiates
testis determination is known to be SRY in mammals (Koopman et al.
1991). In birds, there appears to be no SRY homologue on the W
chromosome (Griffiths 1991), although this is unsurprising given the
separate evolution of sex determination in the two classes.

The only other pertinent evidence on the genetics of avian
sex determination comes from reports of chickens with abnormal
chromosome complements. Table 1 shows data from Crew (1954) and
McCarrey and Abbott (1979) on the phenotypes of the aneuploids so far
described. These results suggest that the presence of the W chromosome
in the aneuploid AA ZZW and the polyploid AAA ZZW has not acted as a
dominant determinant of the female phenotype. This may mean that sex in
birds is determined more by the autosome to Z ratio, as in Drosophila
and C. elegans. However, a ZO aneuploid which could confirm this
hypothesis has yet to be described.

It must also be borne in mind that XXY kangaroos, where SRY
is thought to be the key male determining switch, exhibit both male and
female characteristics (Graves 1987). This suggests that the limited
aneuploid data that are available for birds should be interpreted with some
caution.

To conclude, the genetic mechanism that controls sex
determination in birds has not yet been elucidated. Here we suggest that a
gene we have termed CHD-W (Chromodomain-Helicase-DNA binding on
the W chromosome), alone or acting in conjunction with a closely related
gene CHD-1A (Chromodomain-Helicase-DNA binding 1 Avian), initiates
female development in birds.

The Invention

It is believed that all birds, such as chickens and other species
of commercial significance, will have two or more genes of the CHD type
which will have a nucleotide sequence similar to the nucleotide sequences
shown in Fig. 5, Fig. 7 and Fig. 8 and that the gene products will be
proteins which are crucial to the determination of the sex of the organism.
One of these genes will be located on the W chromosome and the other on
an autosome or Z chromosome.

It will be understood that the exact sequence of the two
genes will vary between species and between individuals of the same
species at least at the nucleotide level and often also at the protein level.
Complete or partial sequences of the chicken genes are shown in Fig. 5,
Fig. 7 and Fig. 8. The gene or protein which contains sequence
corresponding to those in Fig. 5, Fig. 7 and Fig. 8 will hereafter be referred
to as a CHD-gene and proteins and fragments thereof, polypeptides,
nucleic acids and fragments thereof and oligonucleotides containing part of
a CHD-gene will hereafter be referred to as CHD-proteins, CHD-nucleic
acids and so on.

The present invention therefore provides a CHD-protein or a 
fragment thereof or polypeptide comprising a CHD-gene or a part thereof, 
subject to the proviso below. 

The present invention also provides a protein or a fragment 
thereof or a polypeptide containing a mimetope of an epitope of a CHD-
protein or fragment thereof or polypeptide containing a CHD-gene or a part
thereof, subject to the proviso below. Such proteins, fragments and 
polypeptides are hereafter referred to as CHD-mimetope proteins or 
fragments thereof and CHD-mimetope polypeptides. 

The present invention also provides a CHD-nucleic acid or a
fragment thereof or oligonucleotide comprising a CHD-gene, or a part
thereof, subject to the proviso below.

In a particular aspect the present invention provides a single
or double stranded nucleic acid comprising the CHD-gene of a bird or a
part thereof of at least 17 contiguous nucleotide bases or base pairs, or a
single or double stranded nucleic acid hybridizable with the CHD-gene of a
bird, or part thereof of at least 17 contiguous nucleotide bases or base
pairs, subject to the proviso below.
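
By way of illustration only, the requirement of at least 17 contiguous bases can be checked mechanically. The following is a minimal sketch, assuming plain A/C/G/T strings and comparison on one strand only; the function name and the toy sequences are hypothetical and are not taken from the Figures.

```python
def shares_contiguous_run(candidate: str, reference: str, min_len: int = 17) -> bool:
    """Return True if candidate and reference share an identical run of
    at least min_len contiguous bases (same strand only)."""
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) < min_len or len(reference) < min_len:
        return False
    # Index every min_len-mer of the reference, then scan the candidate.
    ref_kmers = {reference[i:i + min_len]
                 for i in range(len(reference) - min_len + 1)}
    return any(candidate[i:i + min_len] in ref_kmers
               for i in range(len(candidate) - min_len + 1))


# Toy sequences (hypothetical, for illustration only).
reference = "ATGGCTAGCTAGGATCCGATCGATTACGCTAGCTAGGCTA"
probe_ok = "TTTT" + reference[5:25] + "CCCC"   # carries a 20-base run of the reference
probe_short = reference[5:21]                  # only 16 bases long
```

A double stranded nucleic acid would additionally be compared against the reverse complement of the reference; that step is omitted here for brevity.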

The invention further provides a nucleic acid or fragment
thereof or an oligonucleotide encoding a CHD-protein or fragment thereof
or a polypeptide comprising a CHD-gene or a part thereof or a CHD-
mimetope protein or a fragment thereof or CHD-mimetope polypeptide,
subject to the following proviso. These nucleic acids, fragments and
oligonucleotides may have sequences differing from the sequences of
CHD-nucleic acids, fragments and oligonucleotides due to alternative
codon usage and/or encoding alternative amino acid sequences or
mimetopes.

The present invention does not, however, extend to any
known protein or fragment thereof or polypeptide or nucleic acid or
fragment thereof or oligonucleotide containing a CHD-gene related
sequence such as the Saccharomyces cerevisiae SNF2/SWI2 gene,
Drosophila polycomb and HP1 genes described below, insofar as that
protein or fragment, polypeptide, nucleic acid or fragment or
oligonucleotide is known per se.

The amino acid sequence of the CHD-gene has similarities to
the chromobox and Helicase motifs of a number of discovered genes
known to be involved in the remodelling of chromatin. This suggests that
the CHD-protein of the present invention may have a regulatory function
involving chromatin remodelling. However, none of these genes contain
the chromobox and the Helicase of the CHD-gene which are conserved in
conjunction, at least in the chicken, great tit, mouse and yeast but are not
conserved in conjunction in the sequences of chromatin remodelling
proteins not associated with sex determination at least at the stage of testis
formation in birds. A gene that produces a protein having chromatin
remodelling capacity but lacking these characteristic motifs is therefore
outside the scope of the present invention.

WO 96/39505    PCT/GB96/01341

In addition there are certain residues in the amino acid 
sequence of the chromobox and those residues immediately downstream 
thereof, of the CHD-gene which are also conserved at least between those 
found in the chicken, great tit, mouse and yeast but are not conserved in 
the sequences of chromatin remodelling proteins not associated with sex 
determination at least at the stage of testis formation in birds. Any one of 
these conserved residues is therefore considered characteristic of the 
CHD-gene proteins of the present invention. The characteristics of a 
CHD-chromobox will give a more complete and comprehensive description
of the CHD-chromobox which can also be considered characteristic of the 
CHD-gene proteins of the present invention. A protein having chromatin 
remodelling capacity and a helicase motif but originating from a gene that 
lacks all or most of these characteristic amino acid residues in the 
chromobox motif is therefore outside the scope of the present invention. 

The characteristic amino acid residues are shown in the
alignment in Fig. 11, which is described in more detail below. When
aligned with the illustrated sequences as shown, these residues fall at
positions 11, 12, 20, 27 and 34 inside the chromobox and 3, 6, 8, 12-15 and 16
immediately downstream.
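
As a purely illustrative sketch, the positions listed above can be used to inspect the columns of an alignment. The code below assumes the sequences are already aligned and numbered from 1 as in Fig. 11; the alignment itself is not reproduced here, so the two toy sequences are hypothetical placeholders and the function name is an assumption.

```python
# Alignment positions (1-based) named in the text.
CHROMOBOX_POSITIONS = [11, 12, 20, 27, 34]
DOWNSTREAM_POSITIONS = [3, 6, 8, 12, 13, 14, 15, 16]


def conserved_positions(aligned_seqs, positions):
    """Return the 1-based positions at which every aligned sequence
    carries the same residue (gap characters '-' never count)."""
    conserved = []
    for pos in positions:
        column = {seq[pos - 1] for seq in aligned_seqs}
        if len(column) == 1 and "-" not in column:
            conserved.append(pos)
    return conserved


# Hypothetical 34-residue placeholders, NOT the sequences of Fig. 11.
toy_a = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKLMNPQ"
toy_b = toy_a[:19] + "F" + toy_a[20:]   # differs from toy_a at position 20 only
```

Applied to the real alignment, a candidate sequence that fails at all or most of the listed positions would, on the wording above, lack the characteristic residues.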

The nucleotide base sequence of the CHD-gene includes 
bases which encode the chromobox and Helicase motifs of chromatin 
remodelling proteins as described above. However, the base sequence of 
the CHD-nucleic acids of the gene will include codons specifying both or 
either chromobox and Helicase motifs and the former -will have codons 
specifying one or more of the characteristic amino acid residues described 
above and/or will be hybridizable with a sequence that controls the sex 
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determination of birds under conditions which substantially prevent 
hybridization to other sequences in birds that do not have these 
characteristics. 

Preferably the CHD-nucleic acids of the invention encode a
chromobox and a helicase and one or more, preferably all, of the
characteristic chromobox amino acid residues and meet the above
hybridization requirements.

Fragments of CHD-nucleic acids according to the present 
invention will likewise contain codons specifying the chromobox and
helicase motifs or including at least part of either of these motifs or CHD-
gene adjacent to the codons encoding these features and/or will be
hybridizable with a sequence that controls the sex determination of birds 
under conditions which substantially prevent hybridization to other 
sequences in birds that do not have these characteristics. 

Oligonucleotides containing the CHD-gene or a part thereof
according to the present invention may contain codons specifying the
chromobox or helicase motifs or including at least part of these motifs or
CHD-gene but this is not essential. However all such oligonucleotides of
the invention must be capable of hybridizing with a sequence or sequences
that control the sex determination of birds or a gene intron, preferably
under conditions which substantially prevent hybridization with any
sequence not associated with a sex determining sequence.

A sex determining sequence referred to herein is a sequence
which contains the CHD-gene and which encodes a factor which when
expressed at the appropriate stage and level during embryo development
may result in testis formation and subsequent growth of the embryo as a
male. It may alternatively refer to a sequence which encodes a factor
which when expressed at the appropriate stage and level during embryo
development prevents testis formation and results in the subsequent
growth of the embryo as a female.

The hybridization conditions referred to above which prevent
unwanted hybridization with sequences not associated with the sex
determining gene will depend to some extent on the length of the nucleic
acid, fragment or oligonucleotide of the invention tested. Thus for instance
lower stringency will be sufficient to secure hybridization to sequences
associated with the sex determining gene whilst preventing unwanted
hybridization when the nucleic acid or fragment is several thousand
nucleotide base pairs in length than for a fragment of only a few hundreds
of bases or an oligonucleotide of from 17 bases up to a few tens or
hundreds of bases. With the smallest oligonucleotides and fragments of
the invention hybridization conditions will be such that only complete
complementarity between the oligonucleotide and/or fragment and the
sequences associated with the sex determining gene will result in
hybridization.

Preferred nucleic acids and fragments of the invention will
only hybridize selectively to the sequences associated with the sex
determining gene or genes under conditions requiring at least 80%, for
instance 85, 90 or even 95%, more preferably 99% complementarity. Yet
more preferred nucleic acids and fragments of the invention are those
having a sequence corresponding exactly to that of those illustrated in Fig.
5, Fig. 7 and Fig. 8 although the nucleotide sequences may be longer or
shorter than those illustrated and/or may contain normally intronic
sequences associated with these sequences.
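
The complementarity thresholds quoted above (80, 85, 90, 95 or 99%) can be illustrated with a simple calculation. This is a sketch only, assuming an ungapped comparison of a probe against a target site of equal length, read antiparallel; the function names and the toy sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(probe, target_site):
    """Percentage of probe bases that pair with an equal-length target
    site, the two strands being read antiparallel."""
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must be the same length")
    expected = reverse_complement(target_site)
    matches = sum(p == e for p, e in zip(probe.upper(), expected))
    return 100.0 * matches / len(probe)


# Toy 20-base target (hypothetical).
target = "ATGCATGCATGCATGCATGC"
perfect_probe = reverse_complement(target)
# Introduce two mismatches (positions 1 and 11 of the probe).
mismatched_probe = "A" + perfect_probe[1:10] + "C" + perfect_probe[11:]
```

On this toy example the perfect probe scores 100% and the probe with two mismatches in twenty bases scores 90%.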

The invention particularly provides an oligonucleotide,
polypeptide, nucleic acid or protein comprising the entire sequence of the
CHD-gene of a bird and more preferably comprising the entire amino acid
or nucleotide sequence of the chicken as set out in any one of Figs 1, 3, 5,
7, 8, 9, 10 and 11.

The nucleic acids hybridizable with the CHD-gene of a bird
are preferably hybridizable under moderate, or more preferably, high
stringency conditions as defined below:

Moderate stringency:

    Buffer: 2 x SSC
    Temp: 50°C
    Annealing period: 6-8 hrs

High stringency:

    Buffer: 1 x SSC
    Temp: 65°C
    Annealing period: 6-8 hrs

Moderate stringency as defined above corresponds with
about 75% homology. High stringency as defined above corresponds with
about 90% homology. 1 x SSC is 0.15 M sodium chloride, 0.015 M sodium
citrate, pH 7.0.
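
The SSC strengths used in this specification all follow by simple scaling from the 1 x SSC definition just given. A minimal sketch of the arithmetic, in which the function name is an assumption:

```python
# 1 x SSC as defined in the text: 0.15 M sodium chloride, 0.015 M sodium citrate.
SSC_1X_MOLAR = {"sodium chloride": 0.15, "sodium citrate": 0.015}


def ssc_millimolar(strength):
    """Return the component concentrations, in mM, of SSC at the given
    strength (e.g. 2 for 2 x SSC, 0.5 for 0.5 x SSC)."""
    return {name: round(molar * strength * 1000, 3)
            for name, molar in SSC_1X_MOLAR.items()}


moderate_buffer = ssc_millimolar(2)    # 2 x SSC
wash_buffer = ssc_millimolar(0.5)      # 0.5 x SSC
```

Thus 2 x SSC is 300 mM sodium chloride with 30 mM sodium citrate, and 0.5 x SSC is 75 mM sodium chloride with 7.5 mM sodium citrate.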

Preferably the portion of the nucleic acid corresponding to or
hybridizable with the CHD-gene is at least 20, more preferably at least 30, 
40 or 60 and most preferably 100 or more nucleotide bases in length. 

The nucleotide strands of the invention may be single or
double stranded DNA or RNA. DNA's of the invention may comprise
coding and/or non-coding sequences and/or transcriptional and/or
translational start and/or stop signals and/or regulatory, signal and/or
control sequences such as promoters, enhancers and/or polyadenylation
sites, endonuclease restriction sites and/or splice donor and/or acceptor, in
addition to the CHD-gene sequence. Included within the DNA's of the
invention are genomic DNA's and complementary DNA's (cDNA's)
including functional genes or at least an exon containing the CHD-gene.
They may also contain non-coding sequences such as one or more
introns. Single stranded DNA may be the transcribed strand or the non-
transcribed (complementary) strand. The nucleic acids may be present in
a vector, for instance a cloning or expression vector, such as a plasmid or 
cosmid or a viral genomic nucleic acid. RNA's of the invention include 
unprocessed and processed transcripts of DNA, messenger RNA (mRNA) 
containing the CHD-gene and anti-sense RNA containing a sequence 
complementary to the CHD-gene. 

Nucleic acids of the present invention are particularly useful 
as primers for polymerase chain reactions (PCRs) conducted to ascertain 
the sex of a bird as defined below. They may also be used to express 
proteins or fragments or polypeptides corresponding to the whole or a part 
of a CHD-protein (whether or not containing a CHD-gene) or as probes in 
hybridization experiments. As used herein the term "fragments" used in
connection with proteins is intended to refer to both chemically produced 
and recombinant portions of proteins. 

The CHD-proteins and fragments thereof and polypeptides 
containing the CHD-gene or a part thereof and CHD-mimetope proteins 
and fragments thereof and CHD-mimetope polypeptides of the invention
are useful in immunodiagnostic testing and for raising antibodies such as 
monoclonal antibodies for such uses. Antibodies against such proteins 
and fragments and polypeptides as well as fragments of such antibodies 
(which antibody fragments include at least one antigen binding site) 
including chemically derived and recombinant fragments of such 
antibodies, and cells, such as eukaryotic cells, for instance hybridomas and 
prokaryotic recombinant cells capable of expressing and, preferably,
secreting antibodies or fragments thereof against such proteins or
fragments, also form part of the present invention. 

The nucleic acids of the invention may be obtained by 
conventional means such as by the recovery from organisms using PCR 
technology or hybridization probes, by de novo synthesis or a combination 
thereof, by cloning the CHD-nucleic acids described below or a fragment
thereof or by other techniques well known in the art of recombinant DNA
technology. 

Proteins and fragments thereof and polypeptides of the 
invention may be recovered from cells of organisms expressing a CHD- 
gene or generated by expression of a CHD-gene or coding sequence 
contained in a nucleic acid of the present invention in an appropriate 
expression system and host, or obtained by de novo synthesis or a 
combination thereof, by techniques well known in the art of recombinant 
DNA technology. The proteins, fragments thereof and polypeptides of the 
invention will contain naturally occurring L-α-amino acids and may also
contain one or more non-naturally occurring α-amino acids having the D- or
L-configuration.

Antibodies may be obtained by immunization of a suitable 
host animal and recovery of the antibodies, by culture of antibody 
producing cells obtained from suitably immunized host animals or by in 
vitro stimulation of B-cells with a suitable CHD-protein, fragment or 
polypeptide or CHD-mimetope protein, fragment or polypeptide and
culture of the cells. Such cells may be immortalized as necessary for 
instance by fusion with myeloma cells. Antibody fragments may be 
obtained by well known chemical and biotechnological methods. 

All these techniques are well known to practitioners of the 
arts of biotechnology. Reference may particularly be made to the well 
known text book "Molecular Cloning: A Laboratory Manual", 2nd Edition (Eds
Sambrook, J., Fritsch, E.F. and Maniatis, T.; Cold Spring Harbor
Laboratory, New York, 1989), hereafter referred to as "Maniatis".

The invention further provides the use of a nucleic acid, 
protein, polypeptide, antibody, or antibody producing cell as hereinbefore 
defined including the SNF2/SWI2, polycomb and HP1 or other chromobox
or helicase containing protein for ascertaining the sex of a cell or organism
of a bird or for isolating nucleic acids useful in ascertaining the sex of a bird
and for instituting single sex breeding programmes.

Knowledge of the chicken or great tit sex determining gene or 
genes can be used to isolate the equivalent gene or genes from other 
birds. Once isolated from a particular species, this gene or genes and its 
sequence can typically be used in two types of application: 

1. The construction of sequence based sexing tests which can
be applied to embryos, tissues and other biological materials containing 
nucleic acids. 

2. The genetic modification of the germ line of birds to create 
breeding systems that produce offspring statistically biased towards one 
sex or of one sex only (single sex breeding systems). 

A particularly preferred technique for ascertaining the sex of a 
bird in accordance with the invention involves the use of
oligonucleotides as primers in a PCR, for instance as follows:

A cell or cells or remains thereof are obtained, for instance by
surgical removal from an embryo or from the quill of a feather, and the
DNA is released by a crude lysis procedure for instance using a detergent
or by heating. Primer oligonucleotides of the invention are used to initiate a
conventional PCR in order to amplify W chromosome linked CHD-related
DNA from the cells. The products of the PCR are analysed by agarose gel
electrophoresis and detected using labelled probes or by visual inspection.
The presence of amplified CHD-W DNA indicates the presence of a CHD-W
gene in the cells and thus, in birds, that the cell(s) were female. An
example of a similar technique has been carried out by Griffiths & Tiwari
(1995) on the Spix's Macaw (Cyanopsitta spixii). This is the world's rarest
bird (Guinness Book of Records) and DNA obtained from a moulted feather
was sufficient to allow nested PCR amplification with CHD primers to show
the bird was a male.
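
The interpretation of the PCR result described above can be summarised as a small decision rule. The sketch below is illustrative only. It assumes that, in addition to the W-linked CHD product, the reaction yields a product present in both sexes (for example a CHD-1A product) which serves as a positive control; that control, the function name and the lane data are assumptions made for the example.

```python
def call_sex(chd_w_band: bool, control_band: bool) -> str:
    """Interpret one PCR lane.

    chd_w_band:   True if the W-linked CHD product was detected.
    control_band: True if a product present in both sexes was detected
                  (an assumed positive control, e.g. a CHD-1A product).
    """
    if chd_w_band:
        return "female"      # W-linked product present
    if control_band:
        return "male"        # reaction worked, no W-linked product
    return "no call"         # nothing amplified: reaction failed


# Hypothetical lanes: (W-linked band seen, control band seen).
lanes = {"bird_1": (True, True), "bird_2": (False, True), "bird_3": (False, False)}
calls = {name: call_sex(*bands) for name, bands in lanes.items()}
```

Without such a control, the absence of a W-linked product could not be distinguished from a failed reaction, which matters where only dried or limited DNA is available.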

This technique may be applied for instance to identify the sex 
of embryos or adults for subsequent breeding programs in other bird
species, or to control the sex of the progeny of breeding stock for
commercial exploitation (by selection of the breeding stock or by slaughter 
or termination of animals of undesired sex). 

The oligonucleotide primers for ascertaining or controlling sex 
in one species may also be used to ascertain or control sex in another
species since hybridization of the primers to the CHD-gene of the other 
species will still serve to amplify the species-specific sequences. 

Techniques for conducting such determinations are well 
known in the art of recombinant DNA technology. 

In another aspect the present invention provides a process
for isolating a W-chromosome specific sequence associated with the CHD-W
gene of a bird which comprises probing a genomic library from a female
of the species, preferably of W chromosome sequences, for instance a
lambda phage, cosmid or YAC library, or a cDNA library constructed from a
tissue expressing the gene, with a probe comprising a nucleic acid,
fragment or oligonucleotide of the invention as hereinbefore defined and a
detectable label under high or moderate stringency.

Using the newly isolated subclone, Southern blots are
performed on male and female DNA of the species of interest at high
stringency to confirm that the correct clone has been isolated. The CHD-
gene probe should give a female specific signal (other male/female shared
bands may also be present at lesser intensities). The subclone is
sequenced using standard methods and primers suitable for PCR chosen
from the sequence so identified.

Alternatively, other approaches to cloning the sequences
related to the sex determining gene could be used such as PCR methods
using "degenerate" oligonucleotides. (For methods in PCR see, for
example, "PCR Protocols - a Guide to Methods and Applications"; edited by
M.A. Innis, D.H. Gelfand, J.J. Sninsky, T.J. White; published by Academic
Press, Inc.).
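
A "degenerate" oligonucleotide is a mixture of related sequences written with ambiguity codes. As an illustrative sketch, the code below expands such an oligonucleotide into its individual members using the standard IUPAC nucleotide codes; the function names and the example oligonucleotides are hypothetical and are not primers of the invention.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """List every unambiguous sequence represented by a degenerate oligo."""
    choices = (IUPAC[code] for code in oligo.upper())
    return ["".join(bases) for bases in product(*choices)]


def degeneracy(oligo):
    """Number of distinct sequences in the degenerate mixture."""
    count = 1
    for code in oligo.upper():
        count *= len(IUPAC[code])
    return count
```

For example, the hypothetical oligonucleotide ARY represents the four sequences AAC, AAT, AGC and AGT. The degeneracy grows multiplicatively with each ambiguous position, which is why such primers are usually kept short or placed on well conserved residues.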
Preferably the probe is CHD-1A or CHD-W or a fragment
thereof or a nucleic acid or fragment or oligonucleotide having a sequence
exactly as set out in Fig. 5, Fig. 7 or Fig. 8 for the chicken. Techniques for
forming a genomic or cDNA library and for probing and detecting the
detectable label and isolating the nucleic acid identified by the probe are
well known in the art of biotechnology and recombinant DNA manipulation.
The process may be conducted for instance using a probe having the
chicken sequence such as the CHD-W sequence to identify and isolate the
corresponding sequence from another bird such as turkey. The thus-
identified sequence can then be used to generate primers for PCR which in
turn can be used to ascertain the sex of an individual or of cells, tissues,
embryos or ovaries of the bird. This technique has been used by obtaining
DNA from the Chicken and Hyacinth Macaw (Anodorhynchus hyacinthinus)
to design primers for the Spix's Macaw (Griffiths & Tiwari 1995). This will
permit experiments to ascertain sex to be conducted and controlled sex
breeding of the bird as described below.

In addition, the nucleotide sequences of the CHD-genes are
sufficiently conserved that CHD primers can be designed that will allow
PCR in a range of bird species. The primers P1, P2 and P3 shown in
Figure 14 will allow CHD-W and CHD-1A amplification in a range of birds
so that sex can be identified.

The isolated nucleic acid, fragment or oligonucleotide may
thereafter be amplified, cloned or sub-cloned as necessary. The invention
further provides a process for detecting the sex of an individual bird or of
cells, tissues, embryos, foetuses or ovaries of a bird, comprising
conducting a polymerase chain reaction using DNA from the individual,
cell, tissue, embryo or ovary as template and a nucleic acid, fragment or
oligonucleotide of the invention as primer. Preferably the nucleic acid,
fragment or oligonucleotide of the invention used as primer is CHD-W or
CHD-1A or a part thereof and has a sequence corresponding exactly to the
chicken sequence in Fig. 5, Fig. 7 or Fig. 8 or a part thereof or is a nucleic
acid, fragment or oligonucleotide which is a W-chromosome specific 
sequence associated with the sex determining gene or genes of a bird of 
the same species as the individual cell, tissue, embryo, foetus or ovary 
whose sex is to be ascertained. The W-chromosome specific sequence 
associated with the sex determining gene or genes of the bird involved 
may itself have been obtained by the process of isolation and amplification 
or cloning described above. It can also be obtained by deduction from the 
sequence in Fig. 5, Fig. 7 or Fig. 8 or a sequence from another bird or 
animal. 

The identification of the sex determining gene or genes 
according to the present invention raises the possibility of controlling the 
sex of progeny of commercially important animals such as chickens, 
turkeys and other avians. This will be valuable in many aspects of animal 
breeding and husbandry such as where one sex has more desirable 
characteristics, for instance only female progeny are desired for egg-laying 
breeds of chicken. The economic advantages of single sex breeding 
programmes and strategies for instituting these are described for instance 
in "Exploiting New Technologies in Animal Breeding; Genetic 
Developments", (Eds. Smith, C, King, J.Q.B. and McKay, J.C.), (Oxford 
University Press, Oxford, 1986). 

The nucleic acids making up all or part of the sex determining 
gene, from the same or different animal species, can be introduced into 
any early embryo through established transgenic technology. This latter
includes microinjection of DNA into pronuclei or nuclei of early embryos, 
the use of retroviral vectors with either early embryos or embryonic stem 
cells, or any transformation technique, (including microinjection, 
electroporation or carrier techniques) into embryonic stem cells or other 
cells able to give rise to functional germ cells. These procedures will allow 
the derivation of individual transgenic animals (founder transgenics) or
chimeric animals composed in part of cells carrying the introduced DNA.
Where the functional germ cells of the founder transgenic or chimeric
animal carry the introduced DNA it will be possible to obtain transmission
of the introduced DNA to offspring and to generate lines or strains of
animals carrying these DNA sequences.

The nucleic acids making up part or all of the coding 
sequence of the sex determining gene, or derivatives of it, may be 
introduced in combination with its own regulatory sequences 
(promoter/enhancers etc.) or regulatory sequences from another gene, the 
whole making the "construct", to give expression from the construct at an
appropriate developmental stage and tissue location critical to sex 
determination in the bird species under consideration. For example, in the 
chicken this would be between 6 and 7 days post lay. 

Materials and Methods

Isolation of pGT-W, pGT1.7 and pGT8 Great Tit clones

A great tit (Parus major) library was constructed from
genomic DNA, partially restricted with MboI, and the λFixII vector
(Stratagene). The library was screened at high stringency with the 724bp
probe (GT-W) cloned from a W chromosome specific polymerase chain
reaction (PCR) product derived from the great tit (Griffiths & Tiwari 1993).
Positive plaques were subject to two rounds of purification. Clone λGT2
contained an insert of 9.6kb that hybridized strongly to the probe
sequence. The insert was subcloned as two EcoRI fragments of 1.7kb
(pGT1.7) and 8kb (pGT8) into EcoRI cut pT7/T3 (Pharmacia).
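
For illustration, the fragment sizes expected from a complete restriction digest of a linear insert can be predicted from its sequence. The sketch below assumes the EcoRI recognition site GAATTC, cut after the first G on the top strand; the function name and the toy sequence are hypothetical and do not represent the λGT2 insert.

```python
def digest_fragment_lengths(seq, site="GAATTC", cut_offset=1):
    """Fragment lengths from a complete digest of a linear sequence.

    EcoRI recognises GAATTC and cuts between G and A on the top strand,
    hence the default cut_offset of 1 base into the site.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Toy linear sequence with a single EcoRI site (hypothetical).
toy_insert = "A" * 10 + "GAATTC" + "C" * 20
toy_fragments = digest_fragment_lengths(toy_insert)
```

A linear insert with one internal EcoRI site gives two fragments whose lengths sum to the length of the insert, as in this toy example.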

Isolation of CHD genes from the chicken 

Two chicken cDNA libraries were screened. The first was a mixed sex
chick stage 10-12 cDNA library in λZapII which had been reamplified on 2
occasions. This library was provided by Dr I. J. Mason. The second library
was constructed from mixed sex, 10 day chick mRNA. Total RNA was
extracted using a guanidine thiocyanate based technique (Koopman 1993)
and mRNA isolated using a Promega PolyATtract system 1000. A λZapII
library was constructed using a Stratagene ZAP-cDNA synthesis kit.
Plaques (2x10^5) from the stage 10-12 day library were screened at
moderate stringency with a subcloned 433bp HindIII/SacI fragment from
pGT8 that contained the 123bp region with identity to the mouse CHD-1
gene (Delmas et al. 1993). A similar number of plaques from both libraries
were screened with bases 428-4428 of CHD-1A (see Fig. 5). The 10 day
library was also screened with bases 4059-5303 of CHD-1A (see Fig. 5).
Positive plaques were purified prior to the excision of pBluescript plasmids
and cloned inserts from λZapII using techniques recommended by
Stratagene.

Sequencing 

All sequencing was carried out using the T7 DNA 
polymerase/7-deaza-dGTP chain termination sequencing kit from USB. All
sequencing unless otherwise specified was carried out in both directions
either by subcloning or through exonuclease III deletion with the Promega
Erase-a-Base system. 

Southern Blot Analysis and Hybridization 

Genomic DNA was extracted from blood (Griffiths & Holland
1990), digested with the appropriate restriction enzyme and Southern
blotted onto Zeta-Probe GT under neutral conditions as described by the
manufacturer (Bio-Rad). Prehybridizations and hybridizations were carried
out in 0.25M Na2HPO4/5% SDS at either 65°C (high stringency) or 62°C
(moderate stringency). Subsequent washes were carried out for a total of
1 hour in three changes of either 0.5 x SSC (75mM NaCl/7.5mM sodium
citrate (pH7.5))/0.1% SDS at 65°C (high stringency) or 1 x SSC/0.1% SDS
at 45°C (low stringency).
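
The wash buffer strengths can be checked with a short calculation. The sketch below assumes that 1 x SSC is 150mM NaCl/15mM sodium citrate, which is the value implied by the 0.5 x figures quoted above; the function name is illustrative only.

```python
# Assumed 1x SSC composition (mM), implied by the 0.5x values quoted in the text.
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}

def ssc_composition(strength):
    """Return the millimolar composition of SSC at the given strength (e.g. 0.5 for 0.5x)."""
    return {component: conc * strength for component, conc in SSC_1X_MM.items()}

high_stringency_wash = ssc_composition(0.5)   # 75mM NaCl / 7.5mM sodium citrate
low_stringency_wash = ssc_composition(1.0)    # 150mM NaCl / 15mM sodium citrate
```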

Sex identification with PCR on dried and limited DNA in a Spix's Macaw

Stratagene provided a genomic Hyacinth Macaw Lambda
FixII Library (Cat No. 946402). Plaques were screened at moderate
stringency with a 1.3kb Chicken CHD-W subclone (spans 2670-4003
nucleotides in the related Mouse CHD1 gene (Delmas et al., 1993)). A
CHD-W genomic fragment was isolated and aligned to the chicken and
mouse homologues to allow the design and construction of 3 primers (5' to
3'):
P3 AGATATTCCGGATCTGATAGTGA,
P2 TCTGCATCGCTAAATCCTTT and
P1 ATATTCTGGATCTGATAGTGA(C/T)TC.
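
As an illustration of what the (C/T) notation in P1 means in practice, the following sketch expands a degenerate primer into the concrete oligonucleotides it represents. The primer sequences are those listed above; the helper name `expand_degenerate` is hypothetical.

```python
from itertools import product

# Primer sequences as listed in the text (5' to 3').
PRIMERS = {
    "P1": "ATATTCTGGATCTGATAGTGA(C/T)TC",
    "P2": "TCTGCATCGCTAAATCCTTT",
    "P3": "AGATATTCCGGATCTGATAGTGA",
}

def expand_degenerate(primer):
    """Expand every '(X/Y)' position into all concrete sequences it stands for."""
    choices = []
    i = 0
    while i < len(primer):
        if primer[i] == "(":
            close = primer.index(")", i)
            choices.append(primer[i + 1:close].split("/"))
            i = close + 1
        else:
            choices.append([primer[i]])
            i += 1
    return ["".join(combo) for combo in product(*choices)]

p1_variants = expand_degenerate(PRIMERS["P1"])  # two 24-mers differing at one base
```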

DNA from the wild Spix's Macaw was extracted (Thomas &
Pääbo 1993) from 1cm portions of the tips of 3 moulted flight feathers
collected in 1994 and 1995. The negative extraction control was taken
through an identical procedure. 1.5% of these extraction products or 50ng
of genomic DNA from the reference samples were subject to semi-nested
PCR. Primary amplification consisted of 20 cycles with primers P3 and P2;
1% of the primary PCR product was subject to 30 cycles of amplification
with P2 and P1. Samples were denatured for 1.5 min at 95°C then cycled
between 57°C/30 sec, 72°C/15 sec and 94°C/30 sec with a 5 min final
extension. Products were precipitated, cut with DdeI, reprecipitated and
electrophoresed through visigel separation matrix (Stratagene). The
accuracy of the test was confirmed using DNA from Spix's and Hyacinth
Macaws of known sex (n=5, p=0.03). Uncut secondary PCR product from
the wild bird was isolated (Dretzen et al. 1981), cloned using the
Stratagene pCR-Script SK(+) kit and sequenced to confirm that the product
had originated from a Spix's Macaw.
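
The quoted p=0.03 for n=5 is consistent with the chance of sexing five birds correctly by guessing alone. The sketch below rests on that assumption (each guess right with probability 0.5); the text does not state which statistical test was actually used.

```python
def chance_all_correct(n_birds, p_guess=0.5):
    """Probability of assigning every bird correctly if each call were a coin toss."""
    return p_guess ** n_birds

p_value = chance_all_correct(5)  # 0.5 ** 5 = 0.03125
```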

Sex identification with PCR in a variety of birds

DNA was isolated from blood taken from Chicken (5
individuals used), Marsh Harrier (28; Circus aeruginosus) and Kestrel (18;
Falco tinnunculus) all sexed by adult plumage, Bee-eater (4; Merops
apiaster; plumage/behaviour), Boobook Owl (2; Ninox novaeseelandiae),
White-faced Owl (2; Ptilopsis leucotis), Burrowing Owl (2; Speotyto cunicularia),
Eurasian Eagle Owl (2; Bubo bubo), Long-eared Owl (2; Asio otus), Tawny
Owl (3; Strix aluco; adult size), Starling (5; Sturnus vulgaris; beak colour)
and African Marsh Warbler (5; Acrocephalus baeticatus; reproductive
behaviour). DNA from a variety of parrots sexed by laparotomy was also
used: Blue Fronted Amazon (3; Amazona a. aestiva), Orange Winged
Amazon (5; Amazona amazonica), Red Lored Amazon (3; Amazona
autumnalis), Yellow Crowned Amazon (2; Amazona o. ochrocephala),
Tucuman Amazon (2; Amazona tucumana), Blue and Gold Macaw (6; Ara
ararauna), Citron Crested Cockatoo (2; Cacatua sulphurea citrinocristata),
Lesser Patagonian (2; Cyanoliseus patagonus), Blue Headed Pionus (1;
Pionus menstruus), Plum Headed Parakeet (4; Psittacula cyanocephala),
African Grey Parrot (12; Psittacus erithacus), Blue Throated Conure (2;
Pyrrhura cruentata), Senegal Parrot (3; Poicephalus senegalus).

All the birds listed above were sexed from DNA using exactly
the same PCR reaction. PCR reaction volumes of 20µl were made up of
Promega Taq buffer (1x is 50mM KCl, 10mM Tris.HCl, 1.5mM MgCl2, 0.1%
Triton X-100), 200µM of each dNTP, P2 (5'-TCTGCATCGCTAAATCCTTT)
and P3 (5'-AGATATTCCGGATCTGATA) primers (approx. 1µM), 50-200ng
of genomic DNA and 0.15 units of Taq polymerase. The thermal treatment
was 94°C/1.5min followed by 30 cycles of 55 or 56°C/15sec, 72°C/15sec,
and 94°C/30sec with a finish of 56°C/1min and 72°C/5min. HaeIII (5 units;
Promega) was used to cut 8µl of PCR product in 1x Promega restriction
enzyme buffer 3 and 50ng/µl bovine serum albumin (Sigma) in a total
volume of 10µl. The digests and uncut PCR product were precipitated
before being electrophoresed in a visigel (Stratagene) with ethidium
bromide (40ng/ml) at 3.5V/cm.
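
The thermal treatment above can be written down as a small data structure, which also gives a rough lower bound on run time. This is only a sketch: the annealing temperature is entered as 55°C (the text allows 55 or 56°C), ramping time between temperatures is ignored, and the variable names are illustrative.

```python
# Each step is (temperature in °C, hold time in seconds), as stated in the text.
INITIAL = [(94, 90)]
CYCLE = [(55, 15), (72, 15), (94, 30)]   # repeated N_CYCLES times
FINAL = [(56, 60), (72, 300)]
N_CYCLES = 30

def total_hold_seconds(initial, cycle, n_cycles, final):
    """Sum of programmed hold times; ramping between temperatures is not included."""
    return (sum(seconds for _, seconds in initial)
            + n_cycles * sum(seconds for _, seconds in cycle)
            + sum(seconds for _, seconds in final))

run_seconds = total_hold_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
run_minutes = run_seconds / 60
```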

Results 

The plasmid pGT-W contains a 724bp insert that hybridizes 
to a 4.9kb fragment only in the female great tit. Its DNA sequence was 
determined (Fig.1) and contains a 457bp open reading frame. A search of 
the EMBL DNA and protein sequence database found no significant 
matches. The sequence does contain a simple sequence consisting of a 
22bp run of thymidines. 

The pGT-W insert was used to probe Southern blots, at low
stringency, of PvuII restricted genomic DNA of male and female great tit,
starling, jackdaw (Corvus monedula), pied wagtail (Motacilla alba) and a
species of new world flycatcher. These are species that cover the
extremes of the passeriform order according to the recent phylogeny of
Sibley et al. (1988). In all but the jackdaw convincing hybridization to a
single female specific fragment could be observed. In all species,
hybridization to one or more non-sex specific fragments was also shown.
A similar experiment was carried out with a non-passerine, the bee-eater
(Merops apiaster), and this too resulted in faint hybridization to a female
specific fragment and two somewhat stronger bands in both sexes.

In order to further investigate the nature of the pGT-W insert
we attempted to clone a larger fragment of genomic DNA which
incorporated this motif. From around 1.5x10^5 plaques from a great tit
genomic library, two positives were obtained. After purification one of
these gave superior hybridization and was investigated further. The 9.7kb
insert was subcloned as pGT1.7 and pGT8 containing 1.7kb and 8kb
respectively. The pGT1.7 was sequenced in its entirety and approximately
2.8kb of the sequence of pGT8 was determined. Both were sequenced in
a single direction. A 723bp region, starting 133bp from the 5' end of pGT8,
had a sequence that corresponded exactly to the pGT-W insert (Fig. 2).

The sequences derived from these subclones were used to
search the EMBL database using the FASTA algorithms (GCG Wisconsin
package, version 7.3). A region of 123bp, starting 994bp from the 5' end of
pGT8, showed a 79% nucleotide sequence identity to bases 3855-3977 of
the mouse CHD-1 gene (Fig. 3; Delmas et al. 1993). This corresponds to
an 88% identity at the amino acid level.
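
Percentage identity figures of this kind are simple position-by-position counts over an alignment. The sketch below shows the calculation on toy strings; the sequences are hypothetical and are not the great tit or mouse data, and gaps are not handled.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions that match between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

# Toy example only: 3 of 4 positions agree.
toy_identity = percent_identity("ACGT", "ACGA")

# The 123bp region is a whole number of codons, so it can encode 41 residues.
codons_in_region = 123 // 3
```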

Southern blots of PvuII digests of genomic DNA from male
and female chicken and lesser black-backed gull (Larus fuscus) were
probed at low stringency with a 433bp SacI/HindIII fragment of pGT8 that
included the 123bp region with CHD-1 identity (Fig. 4). Figure 12 shows
that in the chicken hybridization was with a fragment of 3.1kb in the female
only and with fragments of 1.5 and 6.0kb in both sexes. In the gull
hybridization is similarly with a female specific fragment of 4.0kb and a
fragment of 3.0kb in males and females.

Delmas et al. (1993) have already demonstrated the
universal occurrence of CHD-1 in the mammals. The evidence this blot
provides, which features species representing both the major divisions of
the birds, suggests that a minimum of two types of CHD gene exist in this
Class. The first we termed CHD-W to denote its W linkage. The 123bp
region from the great tit would appear to be a short exon from this gene.
The second hypothetical gene is closely related to CHD-W and we have
termed it CHD-1A, where the A denotes its avian nature. This gene is either
Z or autosomally linked as it occurs in both sexes.

Isolation of CHD-1A

The SacI/HindIII great tit probe was used at low stringency to
screen a λZapII cDNA library from stage 10-12 (33-49hrs after the
appearance of the primitive streak) chicken embryos. A plating of 2x10^5
plaques yielded a panel of 25 positive clones; 19 of these continued to
hybridize intensely after purification. From three clones Z4, Z6 and Z11 a
composite 6608 nucleotide sequence (Fig. 5) was determined using the
strategy illustrated in Fig. 6.

The insert from the Z6 clone (bases 418-4426; Fig. 5) and a
BglII (AGATCT) fragment of the Z4 clone (bases 4059-5303; Fig. 5) were
used separately to screen a similar number of plaques from a second
cDNA library constructed from 10 day old chicken embryos. This
screening identified a total of 45 positives of which 16 were found to have
sequence identity with the composite sequence derived from the first
library. Two additional clones contained a closely related sequence that is
dealt with below.

A proportion of the clones from both libraries show variation
from the sequence given in Fig. 5 in one respect. Clones Z1, Z13, Z17,
Z20 and Z23 are identical to the composite sequence 5' to base 4327; from
there they terminate in an additional 37 to 163 bases of a new sequence
that is identical in all five. Two clones from the second library, CC43 and
CC56, have 22 or 254bp of the same sequence at their 5' ends.
Downstream of this motif both clones regain homology with the
composite sequence at base 4328 and show no further deviation from the
original sequence. From these seven clones a composite 264bp sequence
can be derived and this is illustrated in Fig. 7. None of the seven clones
contains the whole of this sequence. Moreover, none of the ten clones that
span the 4327/4328 insertion point contain any of this additional region. If
inserted at this position, the motif has an in-frame open reading frame
spanning its entire length. The motif is extremely adenosine rich and this
makes the amino acid lysine extremely common in the putative translation
(see Fig. 7). There are no splice donor or acceptor sites within the motif,
suggesting it is a final rather than an intermediary product of splicing.
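
The link between an adenosine rich motif and a lysine rich translation follows from the standard genetic code, in which AAA and AAG both encode lysine. The sketch below illustrates this on a short toy sequence; it is not the 264bp motif of Fig. 7, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate a DNA sequence read in frame from its first base."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence is not a whole number of codons")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def lysine_fraction(dna):
    """Fraction of residues in the translation that are lysine (K)."""
    protein = translate(dna)
    return protein.count("K") / len(protein)

# Toy adenosine rich sequence (hypothetical).
toy_motif = "AAAAAGAAAGAAAAA"
toy_protein = translate(toy_motif)

# A 264bp insertion keeps the reading frame because 264 is a multiple of 3.
insertion_codons = 264 // 3
```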

Hybridization of a probe running from 2534 to 4428bp of the
chicken sequence to a blot of PvuII cut, male and female
chicken genomic DNA shows that hybridization occurs to fragments that
are both W and autosomally or Z chromosomally located. The level of
hybridization is significantly stronger to the fragments common to both
sexes, suggesting that the probe represents the CHD-1A gene.

CHD-1A is very closely related to the mouse CHD-1 gene,
being 79.8% identical in a 5152nt overlap. At the amino acid level the
identity is raised to 90% over 1750 residues. We do have an additional
1202bp of the 3' untranslated region but have not encountered a clone with
an AATAAA termination signal or a 3' homopolymeric T tail. Both mouse
and chicken sequences contain a stop codon in the same relative positions
and sequence similarity is insignificant after this point. The published
mouse sequence does not contain the additional 264bp motif described
above.

The database search also identified an unpublished chicken
derived sequence tagged as a delta crystallin binding protein (DCBP), with
even greater identity than the mouse CHD-1 gene: 99% over 2293 bp and
94% over 571 amino acid residues. The DCBP sequence is of 2292bp
which extends over nucleotides 1922 to 4214 of CHD-1A (Fig. 5). Despite
the high nucleotide sequence identity the region of amino acid similarity
does not extend the full length of the DCBP. This is due to apparent
deletions in the DCBP clone that provide an initiation methionine codon
(257nt DCBP) and a stop codon (1939nt DCBP). The extremely high
sequence identity, the fact that identity is maintained after the apparent
stop in the DCBP sequence, that none of the 41 CHD-related clones we
found have exact sequence identity and that only small sequencing
mistakes would be required to introduce false stop and start codons
suggest that the DCBP sequence is CHD-1A but has been sequenced
slightly inaccurately. Further evidence is required to confirm this.

The database search with the whole CHD-1A gene also
revealed significant identity to a previously unidentified portion of a 15kb
region of S. cerevisiae chromosome V. This region comprises an open
reading frame of 4.4kb which lies between the RAD4 (Gietz & Prakash
1988) and the poly-A binding protein (Sachs et al. 1986) gene coding
regions. In an overlap of 1538 amino acids, the whole of the yeast open
reading frame, there is an identity of 37.7% and a similarity of 59% (Fig.
10). The degree of conservation this similarity implies suggests the yeast
sequence encodes a homologue of CHD-1A that we shall term CHD-1Y for
the sake of discussion.

Delmas et al. (1993) identified four motifs in CHD-1 with
possible functional significance. CHD-1A retains such close homology to
CHD-1 that these regions are virtually unchanged and are likely to perform
similar functions as they do in the mouse.

The first motif is a chromodomain (Paro & Hogness 1991)
which falls between residues 274 and 311 (Fig. 9). Figure 11 compares
the amino acid sequence of this region to that of eight others identified through
a search of the EMBL database. The sequences fall into three categories.
The first comprises the domain from CHD-1, CHD-1A and CHD-1Y. The
second and third chromobox groups have been previously identified by
Pearce et al. (1992). The HP1 class comprises the Drosophila (James &
Elgin 1986) and human (Saunders et al. 1993) HP1 genes and two murine
modifier (Mod) genes (Singh et al. 1991). The HP1 class is characterized
mainly by a glutamic acid rich block of six residues upstream of the
chromobox. The third group, the Pc class, comprises the Drosophila Pc
gene (Paro & Hogness 1991) itself and its putative murine homologue the
Mod3 gene (Pearce et al. 1992).

A search of the EMBL database with the CHD-1A putative
helicase domain (residues 451-911, Fig. 9) raises the identity between this
and CHD-1Y to 55% in an overlap of 471 amino acids. There is also
significant, but lesser, identity to the putative helicase motifs in the human
(Okabe et al. 1992) and S. cerevisiae (Laurent et al. 1992) SNF2 genes,
human (Muchardt & Yaniv 1993) and Drosophila Brahma (Tamkun et al.
1992), S. cerevisiae NPS1/STH1 (Laurent et al. 1992, Tsuchiya et al.
1992), human excision repair protein ERCC6 (Troelstra et al. 1992) and
the RAD54 (Emery et al. 1991) and MOT1 (Davis et al. 1992) genes of S.
cerevisiae. It should be noted that none of these latter genes contain a
chromobox.

Only the four CHD genes show significant homology to the
third motif, a DNA binding region identified by Delmas et al. (1993), whilst
only CHD-1A and CHD-1 have the three short basic HSDHR motifs near the
carboxy terminus, although this region is yet to be sequenced in CHD-W.
The CHD-1Y gene apparently terminates before this point so does not
share this motif. An extended discussion of the homology of the mouse
CHD-1 gene can be found in Stokes & Perry (1995).

Isolation of CHD-W

Two, CC14 and CC4, of eight CHD-1 related clones isolated
from the 10 day chick embryo library using 349-4359nt of CHD-1A as a
probe, overlap (Fig. 5) to provide the 1316bp of sequence given in Fig. 8.
This is a sequence closely related to, but distinct from, CHD-1A. Identity
over the 1316bp overlap is 90.5% and 90.1% at the nucleotide and amino
acid level respectively. An alignment of the putative translations of CHD-1,
CHD-1A and CHD-W is given in Fig. 9. The amino acid identity between
CHD-1 and CHD-1A, at 93.4%, is marginally lower than that between
CHD-1 and CHD-W, 94.2%, over the same region.

The 1335bp insert of CC4 was used at moderate stringency
to probe a male/female, PvuII cut genomic blot featuring mouse, ostrich
(Struthio camelus), chicken, bee-eater and hyacinth macaw (Fig. 13).
Hybridization with the mouse and ostrich shows no evidence of any sex
linkage, bands of the same size and equal intensity appearing in both
sexes. Hybridization with the ostrich is particularly strong, greater even
than with the cognate sequence in the chicken. This suggests that the
genome size of the ostrich is considerably smaller than that of the chicken.
It also demonstrates that the CHD genes cannot be used to sex the ostrich
and, it is suggested, the other members of the ratites. There is no
evidence from further work (reported later) that this effect should occur in
other Parvclasses of the birds (Sibley et al. 1988).

In all the bird species apart from the ostrich, hybridization
occurs with two types of fragment: some that are female unique and others
that are shared between the sexes. In the chicken some of the latter are of
the same size as those hybridizing with the CHD-1A probe and result from
cross hybridization under the conditions of low stringency that we
employed. When probed with the CC4 sequence it is clear that
hybridization with the female linked fragments is far stronger, at least in the
chicken, than with the shared fragments (bear in mind, also, that the female
chicken only has a single dosage of the W linked gene). This indicates
that CC4 is W linked and represents part of CHD-W.

The lanes of the Southern blot of male and
female chickens probed in Fig. 13 contained identical amounts of DNA.
However, examination shows that the shared bands are twice as strong in
males (ZZ) as they are in females (WZ). The only way this could have
happened is if the CHD-1A gene is Z linked. It is suggested this is the
case in all birds.
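
The dosage argument can be made explicit by tabulating the gene copy number expected in each sex for each chromosomal location. The sketch below is illustrative only: the tolerance value and the function name are assumptions, and real band intensities would need careful quantification.

```python
# Expected gene copies per diploid cell by location: (male ZZ, female ZW).
EXPECTED_COPIES = {"autosomal": (2, 2), "Z": (2, 1), "W": (0, 1)}

def infer_linkage(male_signal, female_signal, tolerance=0.25):
    """Guess chromosomal location from relative band intensity in equal DNA loads."""
    if female_signal <= 0:
        return "undetermined"
    if male_signal == 0:
        return "W"
    ratio = male_signal / female_signal
    for location, (male_copies, female_copies) in EXPECTED_COPIES.items():
        if male_copies and abs(ratio - male_copies / female_copies) <= tolerance:
            return location
    return "undetermined"

# Shared bands twice as strong in males, as reported for the chicken blot.
shared_band_location = infer_linkage(2.0, 1.0)
```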

Sex identification with PCR on dried and limited DNA in a Spix's Macaw

The first test was devised to sex DNA extracted from the
feathers of the last wild Spix's Macaw. This was the rarest bird on the
planet and needed to be sexed so a mate could be selected from the 31
captive birds that remained. The test presented two problems. The first
was extracting DNA from feathers, the second providing a test that would
work.

The procedure was published in Griffiths & Tiwari (1995)
which covers the extraction of the DNA. The second task was to obtain
DNA from a Hyacinth Macaw which would yield data to allow construction
of primers. A FIX II library was provided by Stratagene and this was
probed with the insert of the CHD-1A clone Z6 (-227-5302 Fig. 6) at
moderate stringency. This provided 7 positive clones (A1, A2, A7, A8,
A13, 1.2 and 5C). The inserts were extracted, cut with MboI and subcloned
into BamHI cut pUC18. This sublibrary was probed again with the Z6
insert but this time at high stringency. The A1 2.3 subclone hybridized.
This was sequenced and contained 111bp which is aligned to the chicken
and mouse CHD genes in Fig 14. The similarity of this fragment to the
chicken CHD-W suggested this was the Hyacinth Macaw homologue of the
W chromosome located gene.

The data from A1 2.3 supplied information for the design of
the primers required. It also provided evidence that the CHD sequences
were sufficiently conserved in this region that a single set of primers could
be designed to amplify both genes. Three primers, P1, P2 and P3, were
designed to allow seminested PCR (Fig. 14). This technique allowed
amplification of a 104bp region of both CHD-W and CHD-1A from DNA that
was available from two captive Spix's Macaws of known sex. In each sex
the PCR products were of the same size but sequence determination
revealed that the CHD-W derived PCR product possessed a DdeI
restriction enzyme site which was lacking in the CHD-1A product. Thus
PCR amplification and DdeI cleavage of male Spix's Macaw DNA yields
only a single product of 104 base pairs (bp), whilst from female DNA two
products are apparent, one of 104bp and one of 73bp. The presence of
the CHD-1A product in both sexes acts as a control to ensure the PCR
amplification has been successful (Fig 15 & 16).
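
The band pattern described above can be reproduced with a small in-silico digest. The sketch below uses made-up 104bp sequences, not the real Spix's Macaw amplicons; the only facts taken from the text are the DdeI site CTNAG, the 104bp product size and the 73bp female-specific fragment. The 31bp remainder is simply 104 minus 73 and is not mentioned in the source.

```python
import re

def ddei_fragments(sequence):
    """Fragment lengths after cutting at every DdeI site (C^TNAG, cut after the C)."""
    cut_points = [m.start() + 1 for m in re.finditer(r"(?=CT[ACGT]AG)", sequence)]
    edges = [0] + cut_points + [len(sequence)]
    return [end - start for start, end in zip(edges, edges[1:])]

# Hypothetical toy amplicons, 104bp each.
toy_chd_w = "A" * 72 + "CTAAG" + "A" * 27    # carries one DdeI site
toy_chd_1a = "A" * 72 + "CTAGG" + "A" * 27   # CTNGG, so DdeI does not cut

female_bands = sorted(set(ddei_fragments(toy_chd_w) + ddei_fragments(toy_chd_1a)),
                      reverse=True)
male_bands = ddei_fragments(toy_chd_1a)
```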

DNA was extracted from feathers moulted by the wild Spix's
Macaw using a technique devised for the purification of ancient DNA
(Thomas & Pääbo 1993). The PCR-based test described above was used
to demonstrate that CHD-W was not present in the sample (see Fig 16).
This confirmed that the wild bird is male. A female Spix's macaw was
released in March 1995 as a prospective mate.

Sex identification with PCR on a variety of birds 

Birds can be sexed from DNA by showing the presence
(female: ZW) or absence (male: ZZ) of the female specific W chromosome.
At the molecular level this is carried out by the recognition of a W-linked
marker. This can only be done after a W chromosome DNA marker is
identified in the avian species. The test developed for the Spix's Macaw
used CHD-W as a W linked marker. The data collected in designing this
test suggested that this method may work to sex a variety of birds.

If the same test is to work on other bird species then two
criteria must be met. The first is that the PCR primers amplify both
CHD genes in other bird species. The Spix's Macaw test used the tiny
amounts of DNA extracted from feathers so a seminested PCR was
required. This used 3 primers which are aligned to the Mouse and Chicken
CHD nucleotide sequences in Figure 14. The primer sites are highly
conserved: there is no difference between the chicken genes and a solitary
difference between the Mouse and Chicken in the 5' region of the P2 site.
Theoretically, the primers should anneal to other bird species and, if a
reasonable amount of DNA is available (>50ng), a single pair of primers
should provide sufficient amplification.

A second requirement for the test is that the PCR products
can be separated using a restriction endonuclease. In the Spix's Macaw
the DdeI enzyme cuts CHD-W but not CHD-1A. Figure 14 shows that this
discrimination would also occur in the Chicken. However, the DdeI cutting
site CTNAG is absent from the CHD-1A of Spix's Macaw (CTNGG) and
the Chicken (CANAG) for different reasons. This suggests that the DdeI site
is open to mutation so this form of discrimination is unlikely to be
conserved. Other discriminatory sites are available: DdeI and MaeII sites
are unique to CHD-W and the HaeIII, MboII and XhoI sites to CHD-1A and
can be considered the first option. If these fail the CHD-W and CHD-1A
PCR fragments can be cloned and sequenced so discriminatory sites can
be discovered.
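
Once both PCR products have been sequenced, the search for discriminatory sites is mechanical. The sketch below scans two sequences for the recognition sequences of the enzymes named above and reports those present in only one product. The toy sequences are hypothetical; the recognition sequences are the standard ones for these enzymes, and both strands are checked because the MboII site is not palindromic.

```python
import re

# Recognition sequences (N = any base); cut positions are not modelled here.
RECOGNITION_SITES = {
    "DdeI": "CTNAG",
    "MaeII": "ACGT",
    "HaeIII": "GGCC",
    "MboII": "GAAGA",
    "XhoI": "CTCGAG",
}
BASE_PATTERN = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

def reverse_complement(site):
    return site.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]

def has_site(sequence, site):
    """True if the site occurs on either strand of the sequence."""
    for strand_site in {site, reverse_complement(site)}:
        pattern = "".join(BASE_PATTERN[base] for base in strand_site)
        if re.search(pattern, sequence):
            return True
    return False

def discriminating_enzymes(seq_w, seq_z):
    """Map enzyme name to the product ('W' or 'Z') that alone carries its site."""
    result = {}
    for enzyme, site in RECOGNITION_SITES.items():
        in_w, in_z = has_site(seq_w, site), has_site(seq_z, site)
        if in_w != in_z:
            result[enzyme] = "W" if in_w else "Z"
    return result

# Hypothetical toy sequences standing in for the two PCR products.
toy_w = "TTTCTAAGTTTACGTTTT"
toy_z = "TTTGGCCTTTCTCGAGTTTGAAGATT"
candidates = discriminating_enzymes(toy_w, toy_z)
```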

The theory we have presented suggests that a sexing test
based on both avian CHD genes should work on many other bird species.
Does this work in practice? The birds selected for trial are from across the
avian class: Chicken (5 individuals), Marbled Murrelet (18), Kestrel (8),
Marsh Harrier (28), Bee-eater (4), 1 pair of six species of Strigidae Owls
from different genera (see Methods), Starling (5) and African Marsh
Warbler (5).

The primers amplify a PCR product of the predicted size in all
of the birds using primers P2 and P3 on 50-100ng of genomic DNA
extracted from blood. Figure 17 illustrates this for 3 bird species but also
includes amplification from human DNA. This shows that tests using P2
and P3 are open to human DNA contamination so appropriate precautions
must be taken.

The HaeIII restriction enzyme cut the CHD-1A fragment alone
in all 13 species (Fig 17) and, from the sequence data, would also have
worked on the Spix's Macaw (Fig 16). Figure 17 shows that the CHD-1A in
males is cut into two fragments (45bp, 59bp) which are not easily visible on
the gel. In females CHD-W is uncut by HaeIII so remains at 104bp. The
discrimination using HaeIII provided correct sex identification in all
individuals.
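
The decision rule implied by the HaeIII test can be stated compactly. In the sketch below, band sizes are given in bp; the function name and the requirement that the uncut lane shows the 104bp product are an illustrative reading of the text, in which the uncut PCR product was run alongside the digest.

```python
def call_sex_haeiii(uncut_bands, digested_bands, product_size=104):
    """Call sex from HaeIII digest bands; the uncut lane must show the PCR product."""
    if product_size not in uncut_bands:
        raise ValueError("no PCR product in uncut lane: sex cannot be called")
    # CHD-W resists HaeIII, so a surviving full-length band indicates a W chromosome.
    return "female" if product_size in digested_bands else "male"

female_call = call_sex_haeiii([104], [104, 59, 45])
male_call = call_sex_haeiii([104], [59, 45])
```

Because the 45bp and 59bp fragments are hard to see on the gel, the rule keys on the full-length band alone rather than on the small fragments.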

Discussion

The CHD genes 

The female specific great tit probe GT-W was described by
Griffiths and Tiwari (1993) as a means of identifying sex in this species.
The results presented here suggest this sequence represents part of an
intron in a W linked gene. By moving downstream from this sequence it
has been possible to isolate a putative exon from a gene that we have
named CHD-W due to its close sequence identity to the mouse CHD-1
gene (Delmas et al. 1993) and its W location.

Using the CHD-W fragment we attempted to isolate a similar,
W linked sequence that Southern blot analysis had shown was present in
the chicken. From several clones a 6606bp cDNA sequence was
assembled but although it has close sequence identity to the great tit
CHD-W fragment Southern blot analysis shows it is not located on the W
chromosome. This second gene was termed CHD-1A (A = avian). This
blot shows a second gene closely related to CHD-1A is W located. This
sequence could not be cloned from a stage 10-12 chick cDNA library
although 19 CHD-1A clones were isolated. However, two clones yielding
1347bp of a second CHD gene were isolated along with a further 14
CHD-1A clones from a day 10 chick cDNA library. Southern blot analysis
showed that this second clone was W chromosome derived and so
represents CHD-W. Attempts are underway to isolate the remainder of
CHD-W.

Southern blots of a variety of bird species showed that
CHD-W is W chromosome linked in all birds except the ostrich. This
suggests that the gene is sex linked throughout the class with the
exception of the primitive ratites, which the ostrich represents, where it
appears to be autosomally located.

An alternative explanation is that the CHD-W is in fact W
linked in ratites but occurs in a region of the W chromosome which still
recombines with the Z chromosome. If CHD-1A were Z linked, then
recombination between Z and W linked copies of CHD would maintain their
sequence identity resulting in the apparently autosomal location indicated
by the Southern blot. A mammalian example would be the MIC2 and STS
genes that are located in the pseudoautosomal region of the Y
chromosome (Ellis & Goodfellow 1989) and would give analogous results
to those observed here.

Two lines of evidence support this alternative hypothesis.
The first is that the Southern blot analysis suggests that CHD-1A is Z
linked in non-ratites which would make the chromosomal location of the
CHD genes consistent throughout the class. Hybridization of CHD-1A to
genomic blots is apparently stronger to fragments from male birds which
would result from this sex having two copies of any Z linked gene in
comparison to a single copy in the female (this result is not clear cut and
requires confirmation by chromosomal in situ). The second line of
evidence is that the sex chromosomes of the ratites are not
morphologically differentiated as is the case with other birds (Christidis
1990). Morphological similarity suggests recombination still occurs
between extensive regions of the ratite Z and W which may include the
CHD genes and so produce the pattern of hybridization observed.

Although we have yet to clone the whole of CHD-1A the
6606bp sequenced so far shows a close identity to the mouse CHD-1 gene
over the putative coding region. It also includes all four features identified
by Delmas et al. (1993) as having possible functional significance. These
include a chromodomain, a helicase, a DNA binding motif and a basic,
five amino acid motif that is repeated three times (Fig. 9). The similarity of
the sequence derived thus far from CHD-W to that of CHD-1 and CHD-1A
suggests it will be of similar length and possess these same motifs. We
have also identified an alternatively spliced form of CHD-1A and CHD-W
which has a similar adenine rich motif inserted at an identical point
(4327/4328nt CHD-1A and 1316nt CHD-W). The exact form of these
alternative mRNAs has yet to be elucidated. It is interesting to note that we
obtained no clones that spanned these breakpoints which contained this
additional motif; the sequence was built up from partial sequences derived
from either 5' or 3' termini of different clones. Delmas et al. (1993)
produced a mRNA Northern blot probed with fragments of CHD-1 occurring
5' to this breakpoint and discovered an mRNA species of about 4kb. This
would correspond to a species cleaved near this insertion point. What
purpose this would serve is unknown. Moreover the putative yeast
homologue of CHD, CHD-1Y, which was identified from amino acid identity
to CHD-1A from the genomic sequence on the EMBL database does not
apparently have a similar motif. This is suggested because the CHD-1Y
sequence was derived from a genomic clone which would allow the
identification of any such sequence were it to be spliced in the normal
manner.

The significance of the four functional domains found in the
CHD genes will be discussed in turn. The first, the carboxy-terminal trimer
repeat of five basic amino acid residues, has no known function and is not
shared by any other sequences from the EMBL database. Furthermore,
the CHD-1Y gene, which is truncated by a little over 200 amino acid
residues in comparison to CHD-1 and CHD-1A, does not contain this motif.

The second functional domain was identified by Delmas et al.
(1993) as having sequence selective DNA binding capacity. Whether this
is highly specific or just to A+T rich regions was not established. They
also noted that this domain contains Lys-Arg-Pro-Lys-Lys and
Arg-Gly-Arg-Pro-Arg motifs which enable proteins such as HMG-1, D1 and
Engrailed to bind in the minor groove of A+T rich DNA.

A third functional motif is located towards the N-terminus of
the CHD-protein and is termed the chromodomain (Chromatin Organization
Modifier; Paro 1990). This is a highly conserved domain of between
37-50 amino acids that has been shown to be represented in the genomes
of plants, nematodes, insects and vertebrates (Singh et al. 1991). Several
chromobox genes have been isolated from human, mouse and Drosophila
and have been divided into the polycomb (Pc) class and the
heterochromatin protein-1 (HP1) class on the basis of related structure
(Pearce et al. 1992). The CHD genes have a distinct form of the
chromobox characterized by close homology between yeast and vertebrate
forms in the 5' half of the box itself but extending a further 17 residues
downstream. These differences indicate that this form of the chromobox
defines a third subgroup, the CHD class.

The Pc gene forms one of an eponymously named group (Pc-g)
of about 12 genes defined through homeotic mutants in Drosophila that
prevent fixation and maintenance of a determined state. They act as
transcriptional repressors of homeotic genes, notably of the antennapedia
complex (ANT-C; Paro 1990). Members of the ANT-C and the other major
group of Drosophila homeotic genes, the bithorax complex (BX-C), are
responsible for defining segmental identity during development (Kaufman
et al. 1980, Lewis 1978). Initially, their expression patterns are designated
by early acting maternal and segmentation genes (see 4,6,7 Kennison).
However, these maternal genes are only transiently expressed. During the
later stages of development their role as transcriptional activators is
adopted by an assemblage of genes including the trithorax group (Trx-g),
whilst many of their repressive effects are assumed by the Pc-g (Kennison
1993).

The polycomb (Pc) gene itself is perhaps the best studied
member of the Pc-g. Zink and Paro (1989) used Pc-β-galactosidase fusion
proteins to show that it binds to around 100 different sites on the polytene
chromosome including loci where other members of the Pc-g are located.
Any disruption of the chromodomain abolishes the specificity of this
reaction (Messmer et al. 1992). However, the Pc protein appears to lack
any type of endogenous DNA binding capacity so it is thought that it acts
as part of a protein complex with other components that are responsible for
the site specific DNA binding (Paro 1990).

The repressive effects of the Pc-g are thought to be the result
of chromatin compaction. In other words, the DNA is packaged into
heterochromatin to prevent or reduce the expression of functional genes
(Paro 1990). This is a mechanism related to position effect variegation
(PEV; Henikoff 1990), to dosage compensation in mammals, which sees
the complete heterochromatization of one of the female's X chromosomes,
and possibly to gene imprinting whereby the expression of maternally and
paternally inherited alleles differs (Peterson & Sapienza 1993). The links
with PEV have recently been substantiated in that HP1, a recognized
modifier of PEV, and Pc both contain chromodomains (Paro & Hogness
1991). Like the Pc protein, HP1 appears to form part of a structural
complex that transforms euchromatin to heterochromatin. Furthermore,
both PEV and the repressive effects of Pc are passed, in a clonal manner,
to daughter cells (Henikoff 1990, Struhl 1981); a characteristic also of
gene imprinting.

With the CHD-type genes containing both a DNA binding motif
and a chromobox it may appear reasonable to suggest that they encode
repressors with an endogenous, site selective DNA binding system.
However, CHD genes contain a further functional motif that is structurally
related to the Helicases. The sequence identity is closest to the yeast
SNF2/SWI2 (Abrams et al. 1986) and Drosophila Brahma genes (Tamkun
et al. 1992), both of which are transcriptional activators. Indeed, Brahma is
part of the Trx-g, whose members are considered direct antagonists to the
Pc-g. Other genes which contain more distantly related Helicase domains are
involved in DNA repair and chromatid separation during mitosis (Laurent et
al. 1993, Sung et al. 1993).

The SWI2 gene product has been shown to enhance the
transcription of other genes, probably as part of a complex that includes
SWI1, SWI3, SNF5, SNF6 and in conjunction with gene specific DNA
binding proteins (Laurent et al. 1991, Peterson & Herskowitz 1992). This is a
mode of action strikingly similar to that of Pc.

Although it remains to be formally demonstrated that SWI2 is
a helicase, it does have close structural similarities with proven Helicase
genes and also possesses the required DNA stimulated ATPase activity
(Laurent et al. 1993). Laurent et al. go on to postulate that the SWI2
containing complex may act by two mechanisms acting either separately or
in conjunction. In the first they envisage helicase mediated DNA melting to
allow the egress of RNA polymerase II. Alternatively SWI2 could allow
chromatin remodelling, in effect overcoming any inhibitory packaging of the
DNA and so enhancing transcription.

The juxtaposition of a Helicase and a chromodomain within 
15 the same gene presents a paradox that may challenge the perceived roles 
of the two motifs. A simple explanation is that alternative splicing could 
remove one or other of these domains prior to translation. However, there 
is little support for this idea from the work of ourselves or Delmas et al.
(1993).

An alternative explanation could lie in our lack of real
knowledge about the function of the chromobox. Whilst it is well
established that Helicases do dissociate DNA and so facilitate
transcription (Matson & Kaiser-Rogers 1990), the role of the
chromodomain in repression is based on more circumstantial evidence.
Pc, as we have seen, does not bind DNA itself although mutations in the
chromobox prevent the formation of site specific complexes. It is possible
that the chromodomain is involved more in maintaining the structural
integrity of the repressive complex than in the repressive mechanism itself.
Based on this supposition, the CHD-protein may form a different type of
complex able to bind at a site dictated or influenced by its own binding
domain and activate these loci via helicase activity.

While this scenario is speculative, it is probable that
CHD-type genes are active during development and are able to bring
about heritable changes in transcription. The presence of an endogenous
DNA binding domain suggests that the CHD-protein has fewer targets than
Pc, for example, which could form part of several different active complexes.
Being confined to the W chromosome, CHD-W is likely to have a role in some
aspect of female development and we suggest this may be critical to the
determination of gender. In support of this hypothesis we were unable to find
any CHD-W clones in a library constructed prior to sex determination, which
occurs at day 7 (Lutz-Ostertag 1954), but were able to isolate two clones
from a smaller pool of candidates at day 10. This suggests that the
expression of CHD-W may occur at a time consistent with its having a sex
determining role.

If CHD-W alone or in conjunction with CHD-1A causes sex 
determination in birds then several potential mechanisms are plausible. 

(1) In the simplest scenario CHD-W may act as a simple trigger
like SRY (Koopman 1993) to cause either expression or repression of
downstream genes in order to initiate testis development.

(2) CHD-W may interact with other autosomal or Z linked genes
whereby the dosage of CHD-W in comparison to these other factors
initiates development down the male or female pathways.

A more complicated scenario is that CHD-W acts together
with CHD-1A to cause sexual differentiation. Different mechanisms could
operate depending on whether CHD-1A turns out to be Z linked, as we
suspect, or autosomal.

(3) If CHD-1A is Z linked, then male birds get two doses of the
CHD-1A expression product to one in female birds. Perhaps the 1:1 ratio
of functionally distinct CHD-1A and CHD-W products is what initiates
female development whilst a double dosage of CHD-1A results in males.
(4) Alternatively, just the single dosage of Z linked CHD-1A
product could result in female development and expression of CHD-W only
occurs after sexual differentiation to equalize dosages of functionally
similar proteins.

(5) If CHD-1A is autosomal, however, it could be envisaged that
CHD-1A and CHD-W are functional homologues and the three doses in
females (AAW) are required to promote female development, whilst the
double dosage in males (AA) causes the differentiation of the testis and the
development of the male phenotype.

The evidence from aneuploid chickens discussed in the
introduction does suggest that the mechanism that operates involves
some degree of dosage dependence, which tends to exclude mechanism
(1). However, the similarity of CHD-W to HP1, the Pc protein and other
transcriptional modifiers that act through chromatin remodelling shows that
the expression of this type is crucially dependent on dosage (Locke et al.
1988). With the different dosages of gene product and/or potential target
sites that aneuploids possess it may be that analysis of this type of
mutant has, thus far, served to confuse the issue.
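The dosage arguments in mechanisms (3) to (5) can be made concrete by counting gene copies for each chromosome complement listed in Table 1. The following is only an illustrative sketch: the function name `chd_doses` is hypothetical, and the assumption that each Z chromosome (or each autosomal set, written "A") carries exactly one CHD-1A copy and each W one CHD-W copy is ours, not an experimental result.

```python
from collections import Counter

# Chromosome complements and phenotypes exactly as listed in Table 1.
TABLE_1 = {
    "AAZZ": "Male",
    "AAZW": "Female",
    "AAZZW": "Male?",
    "AAZZZ": "Male",
    "AAAZZZ": "Male",
    "AAAZZW": "Intersex/male",
}


def chd_doses(complement, chd1a_location="Z"):
    """Return (CHD-1A copies, CHD-W copies) for a chromosome complement.

    chd1a_location is "Z" for mechanisms (3) and (4) or "A" for
    mechanism (5). One gene copy per chromosome (or autosomal set)
    is an assumption made for illustration only.
    """
    counts = Counter(complement)
    return counts[chd1a_location], counts["W"]


for complement, phenotype in TABLE_1.items():
    z_linked = chd_doses(complement, "Z")
    autosomal = chd_doses(complement, "A")
    print(f"{complement:7} {phenotype:14} Z-linked={z_linked} autosomal={autosomal}")
```

Tabulating the doses in this way shows which complements share the same CHD-1A to CHD-W ratio under each hypothesis, which is the comparison the aneuploid data would need to support.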

Sex Identification

The first W-chromosome linked DNA was isolated by Tone et
al. (1982) from the Chicken. Since then, a number of other W-linked avian
sequences have been discovered (e.g. Griffiths, 1990; Rabenold, 1991;
Griffiths, 1993). In all but one case, described later, these DNA fragments
appear to be non-functional repeats. For instance, the related XhoI and
EcoRI fragments in Chicken may comprise 70-90% of the W chromosome
(Saitoh et al. 1991). This repeat and others in the Lesser Black-backed
Gull (Larus fuscus) can be used to sex birds by the rapid dot blotting
technique (Griffiths & Holland 1990). Other less repetitive W chromosome
markers can be used to sex birds either by probing Southern blots
(Rabenold et al. 1991) or through the use of PCR (Griffiths & Tiwari 1993).

The major problem with all non-functional W-linked DNA is
the speed with which it evolves. The chicken XhoI repeat is fairly typical.
Through low stringency hybridization to a Southern blot it can be used to
sex the Turkey (Meleagris gallopavo) and the Pheasant (Phasianus
versicolor; Saitoh et al. 1991). These bird species are closely related to
the Chicken, being members of the family Phasianidae. By contrast, the
functional CHD-W region described here is 96% (3/67, Fig. 3) identical
between Chicken and Spix's Macaw and this only drops to 86% between
the Chicken CHD-W and the Mouse CHD1 (15/110, Fig. 3). This level of
conservation means that the chicken CHD-W probe can be used on
Southern blots to sex birds from all over the class Aves.
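The identity figures quoted above follow directly from the mismatch counts given in parentheses. A minimal sketch of that arithmetic is given here; the function names are hypothetical, and the second helper assumes the two sequences have already been aligned to equal length.

```python
def percent_identity(mismatches, length):
    """Percentage of identical positions given a mismatch count."""
    if length <= 0:
        raise ValueError("length must be positive")
    return 100.0 * (length - mismatches) / length


def count_mismatches(seq_a, seq_b):
    """Count differing positions between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Mismatch counts quoted in the text (from Fig. 3).
print(round(percent_identity(3, 67), 1))    # Chicken vs Spix's Macaw CHD-W
print(round(percent_identity(15, 110), 1))  # Chicken CHD-W vs Mouse CHD1
```

Rounded to whole numbers these values give the 96% and 86% stated in the text.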

The only exception to the non-functional avian W-linked
sequences is DZWM1 which is a putative gene, cloned from a turkey cDNA
library. Like CHD-W this gene appears to be sex linked in many bird
species. Unfortunately, so little information has been published in the
papers that describe DZWM1 that the nature of the gene remains unknown
(Dvorak et al. 1992, Halverson 1990, Halverson & Dvorak 1993).

For sexing large numbers of birds Southern blot analysis is
slow and expensive. The technique that we have used is based on a PCR
using P2 and P3 primers followed by a HaeIII digestion of the
amplified product. The digestion distinguishes between the CHD-W
product which is uncut and the CHD-1A which is cut. The technique will
work to sex a range of bird species that span the class Aves. The primers
target a highly conserved region so are likely to be 'universal' to the birds
but the discriminatory HaeIII site, which cuts CHD-1A but not CHD-W,
shows no real reason to be conserved. If HaeIII does fail to be
discriminatory, other cutting sites have been suggested or the CHD-W and
CHD-1A PCR products can easily be sequenced to look for an alternative.
Alternatively, the different nucleotide sequence of the amplified CHD-W
and CHD-1A suggests that the two PCR products would be separable on
an agarose gel of around 3% or a non-denaturing acrylamide gel. This
would remove the need for a cutting enzyme and may well make the
sexing technique easier to use.
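The logic of the digestion test can be sketched in a few lines. This is an illustration under stated assumptions rather than the protocol itself: the two sequences are toy placeholders (not the real amplicons), the function names are hypothetical, and the cut position between GG and CC is assumed. The 110bp product size and the 65bp/45bp cut fragments are those given in the legend to Figure 17.

```python
HAEIII_SITE = "GGCC"  # recognition site as given in the Figure 14 legend
CUT_OFFSET = 2        # assumption: blunt cut between GG and CC


def digest(seq, site=HAEIII_SITE, offset=CUT_OFFSET):
    """Return fragment lengths after cutting seq at every occurrence of site."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def call_sex(fragment_sizes, uncut_size):
    """Female if an uncut (CHD-W) product survives digestion, else male."""
    return "female" if uncut_size in fragment_sizes else "male"


# Toy 110bp amplicons: the CHD-1A stand-in carries one site, the CHD-W
# stand-in carries none. These are placeholders, not real sequences.
toy_chd1a = "A" * 63 + "GGCC" + "T" * 43
toy_chdw = "A" * 110

male_lane = digest(toy_chd1a)
female_lane = digest(toy_chd1a) + digest(toy_chdw)
print(male_lane, call_sex(male_lane, 110))
print(female_lane, call_sex(female_lane, 110))
```

The same two functions could be reused with a different recognition site if HaeIII proved not to be discriminatory in a given species.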
The CHD based test appears to be fairly solid but the
possibility of a peculiar mutation in some bird species cannot be excluded.
Cases concerning SRY/Sox3 genes on the sex chromosomes in mammals
support this claim. In two species of the mole vole Ellobius, males have
neither a Y chromosome nor an SRY gene (Just et al. 1995). In a second case,
four species of Akodon have 15-40% of fertile females with
XY chromosomes and an SRY gene (Bianchi et al. 1993). These
examples are particularly peculiar in that the SRY gene is accepted as the
gene that determines sex throughout the mammals. In neither case would
the detection of SRY reliably inform you of the animal's sex.

These examples from the Muridae may never occur with the
CHD genes of birds. However, they do suggest that sex identification by
the amplification of CHD-W and CHD-NW should always be validated by a
test on several individuals in a new species before it is applied. Despite
this warning, the test described here, or other means using
CHD-W or CHD-1A, provides a method to sex most bird
species.
Table 1. Sex of domestic fowl with normal and abnormal chromosome
complements (from McCarrey & Abbott (1979) and Crew (1954)).

Chromosome complement    Phenotype
AAZZ                     Male
AAZW                     Female
AAZZW                    Male?
AAZZZ                    Male
AAAZZZ                   Male
AAAZZW                   Intersex/male



Figure Legends 
Figure 1. The DNA sequence of the pGT-W insert.

Figure 2. A map of the 9.6kb insert of the λFixII clone isolated
from the great tit using pGT-W. pGT1.7 and pGT8 are the two EcoRI
subclones into which the fragment was divided. The broken line
corresponds to the region with absolute sequence identity to the pGT-W
insert. The position of the region with identity to the mouse CHD-1 gene is
indicated.

Figure 3. An alignment of a 123bp fragment of the great tit
(GT) CHD-W gene in pGT8 with the autosomal/Z located chicken (C)
CHD-1A, the chicken CHD-W gene and bases 3855-3977 of the mouse (M)
CHD-1 gene. An alignment of the deduced amino acid sequence is also
given.

Figure 4. The section of pGT8 that hybridized to a female 
specific fragment of 3.1kb in the chicken. This probe was also used to 
screen the chicken cDNA library. The hatched line represents the female 
specific great tit motif shown in Fig. 3. 
Figure 5. The complete nucleotide sequence of CHD-1A as
defined by the clones Z4, Z6 and Z11. Two asterisks underlie the position
where part of the sequence illustrated in Fig. 7 is spliced onto the 5' or 3'
ends of a proportion of the clones isolated. The ATG at nucleotide 228 is
the start codon whilst TAA at 5388 is the stop codon.

Figure 6. The strategies used to determine the nucleotide
sequence of CHD-1A and CHD-W given in Fig. 5 and Fig. 8. The top line
represents the mouse clone given by Delmas et al. (1993). The three 'Z'
clones of CHD-1A and the 'CC4' and 'CC14' clones of CHD-W were
derived from either a stage 10-12 or a 10 day chick cDNA library
respectively. Arrows indicate the direction of sequence determination.
Note Z6 actually ran from -227 to 69. These nucleotides were determined
and are found in Fig. 5.

Figure 7. A composite nucleotide sequence and putative
translation of the motif that is found spliced to a proportion of the 5' or 3'
termini of CHD-1 clones or the 3' end of the CHD-W clone CC14. The
portion attached to the CC14 sequence is incomplete.

Figure 8. A partial nucleotide sequence of CHD-W as 
defined by the clones CC4 and CC14. 

Figure 9. An alignment of the deduced amino acid
sequences of the chicken (C) CHD-1A and CHD-W with the mouse (M)
CHD-1. With gaps introduced to maximize alignment they show a
sequence identity of 91.6% over 1365 residues. The $ sign indicates start
and stop codons. Boxed sections are the chromodomain (C), Helicase (H)
and the region containing the DNA binding domain (B) identified by Delmas
et al. (1993). A trimer repeat of a basic HSDHR motif is underlined. An
asterisk (*) denotes residue identity and a dot (.) similarity.

Figure 10. An alignment of the deduced amino acid
sequences of CHD-1A and CHD-1Y, a putative yeast homologue of the
chicken gene identified through a search of the EMBL database. With
gaps introduced to maximize alignment they show a sequence identity of
37.7% over 1538 residues. | indicates identity and : conservative
substitution.

Figure 11. Comparison of 9 chromodomain sequences.
Vertical lines indicate the extent of the chromodomain as defined by Paro
& Hogness (1991). The top three sequences represent the CHD class of
chromodomain to add to the HP1 class and Pc class
as defined by Pearce et al. (1992). The first letter of each annotation
indicates the animal of origin: C, chicken; M, mouse; D, Drosophila; H,
human; Y, S. cerevisiae, whilst the remainder identifies the gene type. The
yeast gene is a possible CHD homologue identified by its close identity to
the vertebrate forms. * indicates sequence identity within the groups and A
identity between all nine sequences. * indicate amino acid residues inside
and downstream of the motif that are characteristic of the CHD class
chromobox.

Figure 12. Genomic Southern blots of DNA from male and
female chickens and lesser black-backed gulls digested with PvuII and
probed with a 433bp HindIII/Sac fragment of pGT8 (Fig. 4) at moderate
stringency. Hybridization with female linked fragments and fragments
common to both sexes can be observed in both species. Numbers give
approximate sizes in kilobases.

Figure 13. Genomic Southern blots of DNA from male (M)
and female (F) mice, ostrich, chicken, bee-eater and hyacinth macaw
probed with the 1335bp insert of CC4 at moderate stringency.
Hybridization with mouse and ostrich is with fragments shared by both
sexes whilst the non-ratite birds show additional hybridization to female
specific fragments. In these latter species, the signal from female linked
hybrids is stronger than with autosomal/Z linked fragments indicating that
the probe is derived from the W chromosome. Numbers give approximate
sizes in kilobases.
Figure 14. The nucleotide sequence of part of a single CHD1
gene isolated from the Mouse and the homologous genes from the
Chicken, Hyacinth (A12.3 subclone) and Spix's Macaw all arranged as
putative codons. Dashes denote nucleotides shared with the Mouse
sequence. The primers designed are shown on the diagram. An arrow
head indicates a non-synonymous mutation in the Spix CHD-W. The DdeI
(CTNAG) and HaeIII (GGCC) sites are underlined.

Figure 15. The technique of PCR sex identification in the
Spix's Macaw. Semi-nested PCR amplification is carried out on both sexes
with the primers P2/P3 then P1/P2 to provide products of identical sizes in
both sexes. The products are then cut with restriction enzyme DdeI which
cuts only the CHD-W product from the female. The cut products are run on
a visigel and the difference between the sexes can be visually detected.
See Fig. 17 for an example.

Figure 16. DdeI restricted PCR products demonstrating that
the remaining wild Spix's Macaw is male. Lane 1, the wild bird; 2, negative
extraction control; 3, known male; 4, known female. The larger fragment is
of 104 bp and the female W-chromosome specific fragment of 73 bp.
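The band pattern described for the DdeI test can be sketched as follows. This is a hedged illustration only: the sequences are toy placeholders rather than the real macaw amplicons, the function names are hypothetical, and the cut position after the first C of CTNAG is an assumption. The 104 bp and 73 bp sizes are those given in the legend to Figure 16.

```python
import re

# CTNAG with N = any base; the lookahead lets overlapping sites be found.
DDEI_PATTERN = re.compile(r"(?=CT[ACGT]AG)")
DDEI_CUT_OFFSET = 1  # assumption: cut after the first C (C^TNAG)


def ddei_fragments(seq):
    """Return fragment lengths after cutting seq at every CTNAG site."""
    seq = seq.upper()
    cuts = [m.start() + DDEI_CUT_OFFSET for m in DDEI_PATTERN.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def call_sex_from_bands(bands, w_specific=73):
    """Female if the W-specific cut fragment is present, else male."""
    return "female" if w_specific in bands else "male"


# Toy 104bp amplicons: only the CHD-W stand-in carries a DdeI site.
toy_chdw = "A" * 72 + "CTGAG" + "A" * 27
toy_chd1a = "A" * 104

male_bands = ddei_fragments(toy_chd1a)
female_bands = ddei_fragments(toy_chd1a) + ddei_fragments(toy_chdw)
print(male_bands, call_sex_from_bands(male_bands))
print(female_bands, call_sex_from_bands(female_bands))
```

Note that the logic is the reverse of the HaeIII test: here the cut product, not the uncut one, marks the female.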

Figure 17. Sex identification in the Marsh Harrier (MH),
Chicken (C) and African Marsh Warbler (AMW) carried out using an
identical reaction. For each species genomic DNA of male and female
birds was subject to PCR with primers P2 and P3 and the product of 110bp
is visible in lanes 1 and 2. In lane 3 the entire male PCR product, amplified
from CHD-1A, has been cut into two parts with HaeIII (65bp, 45bp). In
females, lane 4, this HaeIII cut product is also present but the CHD-W product
remains uncut so the sex can be identified. The 'Kb' lane contains a '1Kb
DNA ladder' (BRL), the 'H' lane is a PCR reaction with P2 and P3 carried out
on human genomic DNA and the '-ve' lane contains a negative PCR reaction.
CLAIMS

1. The nucleotide sequences of CHD-1A and CHD-W as shown
in Fig. 5, Fig. 7 and Fig. 8.

2. A clone or subclones of CHD-1A and CHD-W as defined in
claim 1.

3. A fragment of CHD-1A and CHD-W capable of giving a W
specific signal on hybridization to a non-ratite bird.

4. A fragment of CHD-1A and CHD-W obtainable by restriction
endonuclease digestion thereof and being capable of giving a W specific
signal on hybridization to genomic DNA of a non-ratite bird.

5. A clone or subclone of a fragment according to either of
claims 3 and 4.

6. A nucleic acid or fragment or oligonucleotide having
substantially the sequence of CHD-1A and CHD-W as set out in Fig. 5, Fig.
7 and Fig. 8.

7. A clone or a subclone of a nucleic acid or fragment or
oligonucleotide according to claim 6.

8. A nucleic acid or fragment or oligonucleotide having
substantially the same sequence of the chicken or great tit CHD-gene as
set out in Figs 1, 3, 5, 7 or 8.

9. A nucleic acid or fragment or oligonucleotide being capable of
giving a W chromosome specific signal on hybridization to the genomic
DNA of a non-ratite bird.

10. A nucleic acid or fragment or oligonucleotide according to
claim 4 or claim 9 capable of giving W chromosome specific signal on
hybridization to the genomic DNA of a chicken, turkey, duck, parrot.
11. A nucleic acid or fragment or oligonucleotide according to any
one of claims 4, 9 and 10 capable of giving W chromosome specific signal
on hybridization to the genomic DNA of a non-ratite bird under conditions
of high stringency.

12. A nucleic acid or fragment or oligonucleotide according to any
one of claims 4, 9 and 10 capable of giving W chromosome specific signal
on hybridization to the genomic DNA of a non-ratite bird under conditions
of low stringency.

13. A nucleic acid or fragment or oligonucleotide according to any
one of the claims 9 to 13 containing substantially the sequence of the
chicken CHD-gene as set out in Fig. 5, Fig. 7 and Fig. 8.

14. A nucleic acid or fragment or oligonucleotide encoding a
CHD-protein, fragment thereof or polypeptide containing a CHD-gene or
part thereof or encoding a CHD-mimetope protein or fragment thereof or
CHD-mimetope polypeptide.

15. A process for ascertaining the sex of an embryo, foetus, cell,
tissue or organism comprising hybridizing a nucleic acid or fragment or
oligonucleotide according to any one of claims 1 to 14 with DNA or RNA of
the embryo, foetus, cell, tissue or organism or with cDNA reverse
transcribed from RNA of the embryo, foetus, cell, tissue or organism or
with cDNA or DNA amplified by cloning or polymerase chain reaction from
DNA or RNA of the embryo, foetus, cell, tissue or organism.

16. Use of a nucleic acid or fragment or oligonucleotide of any
one of claims 1 to 14 in ascertaining the sex of an embryo, foetus, cell,
tissue or organism.

17. A process for controlling the sex of the progeny of an
organism comprising inserting a nucleic acid or fragment or oligonucleotide
of any one of claims 1-14 into the genome of the organism or progenitor
thereof.
18. Use of a nucleic acid or fragment or oligonucleotide of any
one of claims 1 to 14 in controlling the sex of the progeny of an organism.

19. A CHD-protein, fragment thereof or polypeptide containing a
CHD-gene or part thereof or a CHD-mimetope protein, fragment thereof or
a CHD-mimetope polypeptide.

20. A protein or fragment thereof or polypeptide containing a
CHD-chromobox including at least one of the characteristic amino acid
residues at position 11, 12, 20, 27 or 31 inside the chromobox or 3, 6, 8,
12-15 or 16 directly downstream of the chromobox when aligned to best
effect and as set out in Fig. 11.

21. A protein or fragment thereof or a polypeptide encoded by a
nucleic acid or fragment or oligonucleotide according to claims 1-14 and
containing a CHD-chromobox.

22. A process for controlling the sex of the progeny of an
organism comprising supplying exogenously to a cell of the organism or a
progenitor of the organism a protein or fragment thereof or a polypeptide
according to any one of claims 19-21.

23. A process according to claim 22 wherein the protein or
fragment thereof or polypeptide is supplied and activates a CHD-1A or
CHD-W target gene.

24. An antibody or fragment thereof against a protein or fragment
thereof or polypeptide according to any one of claims 19-21.

25. An antibody producing cell capable of expressing an antibody
or fragment thereof according to claim 24.

26. Use of a protein or fragment thereof or polypeptide according
to any one of claims 19-21 or antibody or fragment thereof or cell
according to claims 24 or 25 in ascertaining the sex of an embryo, cell,
tissue or organism.
27. A transgenic or chimeric animal having a heterologous
nucleic acid or fragment or oligonucleotide according to any one of claims 
1 to 14 in the genome of at least the germ cells of the animal. 

28. Gametes of an animal according to claim 27. 

29. Progeny of an animal according to claim 27. 

30. Progeny according to claim 29 which are transgenic or 
chimeric and have a heterologous nucleic acid or fragment according to 
any one of claims 1-14 in the genome of at least the germ cells of the 
progeny. 

31. A method of controlling the population of a species of bird
which comprises introducing an individual member of the species into the 
population, said individual having a copy or copies of a nucleic acid 
fragment or oligonucleotide according to any one of claims 1 to 14 
integrated on a chromosome (carrier chromosome) be it sex linked or 
autosomal whereby when the male breeds with other individuals of the 
population the progeny are substantially of one sex or are sexually 
dysfunctional intersexes. 

32. A method according to claim 31 where the nucleic acid 
integrated into the carrier chromosome is homologous to the native 
CHD-1A or CHD-W gene of the bird. 
Figure 1.

CCCGGTCGGAGGTTTCAAGGAATGACTAGATGTGGCACTTAGTGCCATGGTCTAGTTGAC 60 

AAGGTGATGGTTGGTCAAAAGTTGGACTCGATGATCTCAGAGTTTTTTTCCAGCCTTAAT 120 

AATTCTATGAATTCTGTAATTTTATTCTTGATCTTTTTGAGCGAAGTTTGTTTGGGGATT 180 

TTAGTTTGGTTTCCCTGTCACTGTTTTCTTTCCTTGAAACTGACTTTCATTTGCAACATG 240 

AGAATTGCTGTATTTGTCAGGTTACAAGTAGTGCAATGGCTGCTTA^ 300 

CATTTAGGGAAATACT<KIAGTGAAGCAAACACAGTGGTACTGCCAAACTGTAGCTTTGGG 360 
ATTTGAGGAGCCACAGAGTTGTATATAAATTTGTTTAATGATATCCTGCCCCTGCCTTCC ' 420 

ATTAATTGCTTGTTTTATGAAACCACTCTTTTTTTTTTTTTTTTTTTTTTGGCTTCTTCA 480 

TATCCTGTGGTAATGAGTTAATGCATTTAGAAGCACATGGCAGAACTAGGAGATCTGTGG 540 

ATGACAGTGGTACAGGAGCTCTGAATTTTTTAGATAAACTATGAGAGTGGAAACAGAAAT 600 

CTGAGGCTAGTTTCTTGAGCTGACTGTAAATTTTGTGAGAATATTTTCAAGACTACATTA 660 

GTTGTGTGTTTGAGGAAAAATAAAATGTTTAAGTTGTCCATTCCTTGAAACCTCCCGACC 720 

GGG 723 
Figure 2.

[Map: 123bp region with identity to CHD-1; subclones pGT1.7 and pGT8]

Figure 3.

M  CHD-1   ATTCTTCCAG ATGATCCTGA TAAAAAACCA CAAGCAAAAC AGTTACAGAC
C  CHD-1A  ATTTTACCTG ATGATCCAGA CAAGAAACCC CAGGCAAAGC AGCTACAGAC
C  CHD-W   ATTTTACCTG ATGATCCAGA TAAGAAACCC CAGGCTAAGC AGTTACAGAC
GT CHD-W   ATTTTACCTG ATGACCCAGA TAAGAAACCA CAGGCAAAGC AGTTGCAGAC

M  CHD-1   CCGTGCAGAC TACCTCATCA AACTACTTAG CAGAGATCTT GCAAAAAGAG
C  CHD-1A  CCGTGCAGAC TACCTCATTA AATTACTGAA TAAAGACCTT GCAAGAAAGG
C  CHD-W   CCGTGCAGAT TACCTCATTA AATTACTGAA TAAAGACCTT GCAAGAAAGG
GT CHD-W   CCGTGCAGAT TACCTCATTA AATTACTGAA TAAAGACCTT GCAAGAAAAG

M  CHD-1   AGGCTCAGAG ACTTTGTGGT GCG
C  CHD-1A  AAGCACAAAG GCTTGCTGGT GCA
C  CHD-W   AAGCACAGAG ACTTGCTGGT GCA
GT CHD-W   AAGTGCAAAG ACTTACTGGT GCA

M  CHD-1   ILPDDPDKKPQAKQLQTRADYLIKLLSRDLAKKEAQRLCGA
C  CHD-1A  ILPDDPDKKPQAKQLQTRADYLIKLLNKDLARKEAQRLAGA
C  CHD-W   ILPDDPDKKPQAKQLQTRADYLIKLLNKDLARKEAQRLAGA
GT CHD-W   ILPDDPDKKPQAKQLQTRADYLIKLLNKDLARKEVQRLTGA
           **************************  *** ** *** **
Figure 5.



1 CGGGCTGCGG CACGAAGCGC ACCGCCGGCG CACGCAGGCT CGGGCCaaar 

,m CCGCCGMCC ^GCACGC SS SS 

}° T?^IC T G TAGAGAATAG CAAGTCAAAC GCATTACTTC GAAAACATAC 

151 GGAGTACCAG AAAGGGGATT CTTGACCTAC ACCTTGTAAC CTGAGTGGAC 

201 TTTCTTTTTA ACTTCTTAAT ACTTACAATG AATGGGCACA GTGATGAAGA 

251 AAGTGTAAGA AACAGCAGTG GAGAGTCAAG CAGATCAGAT GATGATTCTG 

301 GGTCAGCTTC AGGTTCTGGA TCTGGTTCAA GCTCTGGAAG CACTACrrlS 

ll\ ?^ CTAGC * GCCAGTCAGG TAOCAGTGA^ 

401 AGGCAGTCAA TCCGAATCAG AGTCTGACAC ATCTAGAGAG AAGAAACAAG 

451 TTCAAGCTAA ACCTCCGAAA GCTGACGGAT CTGAGTTTTG GAAGTCCAGT 

501 CCAAGCATAC TTGCTGTACA GAGATCAGCA GTGCTCAAGA AGCAACAGCA 

55 ACAGCAAAAA GCAGCATCAT CAGACAGTGG TTCAGAAGAG GACTCATCCA 

601 GTAGTGAAGA TTCTGCCGAT GATTCGTCCA GTGAAACTAA GAAGAAAART 

651 CATAAAGATG AAGACTGGCA AATGTCAGGG S« ™S^Aa"c 

701 TGGTTCTGAT TCTGAATCGG CGGAAGATGG GGATAAAAGC ACTTGTG^G 

751 AAAGTGAATC TGACTATGAG CCAAAAAACA AAGTCAAAAG CCGTAAACCT 

^^S° AGAA WAAGCCAAA AAGTGGGAAA AAGAGCACAG GACAGAAGAA 

851 GAGGCAACTT GATTCATCAG AGGAGGAGGA GGACGATGAT GAAGATTATG 

901 ATAAGAGAGG ATCTCGTCGC CAGGCAACAG TGAATGTTAG TTACAAAGAA 

551 GCTGAAGAAA CCAAGACAGA TTCTGATGAT TTGCTGGAAG TTTGTGGAGA 

%£lfl C r£r ^J^ G AAGATGAATT TGAaSa GAGAAG^a" 

T?aJ I?^- 07 ^ aattggccga aaaggagcca ctggtgcctc aaccaccatc 
1101 tatgccgttg aggcagatgg tgacccaaat gctgggtttg aaaagtcaaa 
ggagctggga gaaatacagt atcttattaa atggaaaggc iSSSSS 

1201 TCCATAACAC TTGGGAAACT GAAGAAACGC TGAAGCAACA AAATGTTAAA 
^AC-^CAA CTACAAGAAA AAGGATCAGG AGACAAAACG 
1301 CTGGCTGAAA AATGCTTCTC CAGAAGATGT GGAATATTAT AACTGCCART 
«" AGGAGCTTAC AGATGATCTG CACAAACAAT ATCAAaS 
1401 ATTGCTCATT CAAATCAAAA GTCAGCAGCT GGTTATCCGG ACTACTATTG 
1451 CAAATGGCAG GGTCTGCCTT ACTCAGAATG TAGCTGGGAA GATGGTGCTC 
1501 TCATTGCCAA AAAGTTTCAG GCACGCATTG ATGAGTATTT TAGCAGAAAT 
1551 CAATCCAAGA CTACTCCCTT TAAGGACTGC AAGGTTCTAA AACAGAGACC 
1601 AAGATTTGTT GCACTGAAGA AGCAACCATC TTACATTGGA GGACATGAAA 
1651 GTCTGGAGTT AAGAGATTAT CAGTTAAATG GATTGAATTG GCTCGCTCAT 
1701 TCATGGTGCA AAGGAAATAG TTGTATTCTT GCAGATGAAA TGGGTCTGGG 
1751 TAAAACAATA CAAACAATTT CTTTTCTGAA CTACCTGTTT CATGAACATC 
1801 AACTGTATGG CCCTTTTCTT CTGCGCGTGC CACTTTCTAC CTTGACATCT 
1851 TGGCAAAGAG AGATTCAAAC TTGGGCTCCT CAGATGAATG CTOTACTTTA 
\ S °] CTTAGGAGAT ATAACTAGTA GAAATATGAT AAGGACTCAT GAATGGATCC 
1951 ATCCACAGAC TAAACGATTA AAGTTTAACA TACTTCTGAC GACATATGAA 
ATTTTACTGA AGGATAAGTC ATTCCTTGGT GGTCTCAATT GGGCATTCAT 
2051 AGGAGTTGAT GAAGCTCATC GTTTAAAAAA TGATGACTCT CTTCTOTACA 
2T0T GGACTTTAAT AGACTTTAAG TCCAACCATC GACTTCTGAT TACTGGAACC 
2151 CCACTGCAAA ATTCCCTCAA AGAGCTGTGG TCTTTGTTGC ATTTCATCAT 
2201 GCCAGAAAAA TTTTCCTCCT GGGAAGATTT TGAAGAGGAG CATGGCAAAG 
2251 GAAGAGAGTA TGGTTATGCA AGTCTTCACA AAGAGCTTGA ACCATPTTTA 
2301 CTAAGAAGAG TTAAAAAAGA TGTAGAAAAG TCTTTACCTG CTAAGGTTGA 
"51 ACAAATTCTG AGGATGGAAA TGAGTGCATT GCAGAAGCAA TATTACAAGT 
acttt ISrS AAAGCCCTCA GTAAAGGTTC AAAAGGCAGT 

2451 ACCTCAGGCT TTCTGAACAT TATGATGGAA CTTAAGAAGT GTTGTAACCA
2501 TTGCTACCTC ATTAAGCCAC CAGATGATAA TGAATTCTAT AATAAACAGG
2551 AGGCCTTACA GCATTTGATA CGTAGCAGCG GGAAACTAAT CCTTCTTGAC 
2601 AAGCTACTGA TTCGTCTGCG AGAACGTGGC AACAGAGTTC TGATTTTCTC 
2651 TCAGATGGTG AGGATGCTGG ACATCCTAGC AGAATATCTG AAGTATCGCC 
2701 AGTTTCCCTT CCAGAGACTT GATGGATCAA TAAAAGGGGA ATTGAGGAAG 
2751 CAAGCACTGG ATCATTTCAA TGCAGAAGGA TCAGAGGATT TCTGTTTTTT 
2801 ACTGTCTACA AGAGCTGGAG GATTAGGTAT TAACTTGGCA TCTGCTGACA
2851 CTGTAGTTAT TTTTGATTCT GACTGGAATC CACAGAATGA TCTGCAGGGA
2901 CAGGCGAGAG CTCATAGAAT TGGACAGAAG AAACAGGTTA ATATTTATCG
2951 GCTAGTCACA AAAGGATCAG TAGAAGAAGA TATTCTTGAA AGAGCCAAGA
3001 AGAAGATGGT GCTAGACCAT TTAGTAATTC AGAGAATGGA CACGACAGGA
3051 AAAACTGTTC TGCATACAGG TTCAACTCCA TCAAGCTCTA CACCTTTTAA
3101 TAAAGAAGAG TTATCAGCTA TTTTGAAGTT TGGTGCTGAG GAACTCTTTA
3151 AAGAACCTGA AGGAGAAGAA CAGGAGCCCC AGGAAATGGA TATAGATGAA
3201 ATCTTGAAGA GAGCTGAAAC TCGGGAAAAT GAGCCAGGTC CATTGACTGT
3251 AGGGGATGAG TTGCTTTCAC AGTTCAAGGT GGCGAACTTT TCCAATATGG
3301 ATGAAGATGA TATTGAGTTG GAACCAGAAA GAAATTCAAG AAATTGGGAA
3351 GAAATCATCC CAGAATCCCA ACGGAGAAGG ATAGAGGAGG AGGAAAGACA
3401 AAAAGAACTT GAAGAAATAT ACATGCTCCC GAGGATGAGA AACTGTGCAA
3451 AACAGATCAG CTTTAATGGG AGTGAAGGAA GACGCAGTAG GAGCAGAAGA
3501 TATTCTGGAT CTGATAGTGA CTCCATCACA GAAAGAAAAC GGCCAAAAAA
3551 GCGTGGAAGA CCTCGAACCA TTCCTCGAGA AAATATTAAA GGATTTAGTG
3601 ATGCAGAGAT CAGGCGGTTT ATCAAGAGTT ACAAGAAATT TGGTGGCCCT
3651 CTGGAAAGGT TAGATGCTGT AGCTAGAGAT GCTGAACTGG TTGATAAATC
3701 TGAGACAGAC CTTAGACGTT TGGGTGAACT TGTACATAAT GGATGCATTA
3751 AGGCTTTAAA GGACAATTCA TCTGGACAAG AAAGAGCAGG AGGTAGACTT
3801 GGGAAAGTTA AAGGCCCAAC GTTTCGAATC TCAGGAGTGC AGGTGAATGC
3851 AAAACTAGTC ATCTCTCACG AAGAAGAGCT GGCACCACTG CACAAATCCA
3901 TTCCTTCAGA TCCAGAAGAA AGGAAAAGAT ATGTCATCCC ATGCCACACC
3951 AAGGCTGCTC ACTTCGATAT AGATTGGGGT AAAGAAGATG ATTCCAATCT
4001 GTTAGTAGGC ATCTATGAAT ATGGCTATGG CAGCTGGGAA ATGATAAAAA
4051 TGGATCCAGA TCTCAGCTTA ACACAGAAGA TTTTACCTGA TGATCCAGAC
4101 AAGAAACCCC AGGCAAAGCA GCTACAGACC CGTGCAGACT ACCTCATTAA
4151 ATTACTGAAT AAAGACCTTG CAAGAAAGGA AGCACAAAGG CTTGCTGGTG
4201 CAGGCAATTC CAAGAGAAGG AAGACAAGAA ATAAGAAGAA TAAGATGAAG
4251 GCTTCAAAAA TAAAAGAAGA AATAAAGAGT GATTCTTCAC CACAACCCTC
4301 AGAAAAATCT GATGAAGATG ATGAGGAGGA GGATAACAAG GTAAATGAAA
4351 TCAAATCTGA AAATAAAGAA AAATCTAAAA AAATTCCATT GCTGGATACT
4401 CCAGTTCATA TTACTGCAAC CAGTGAACCA GTTCCTATCT CAGAAGAATC
4451 TGAAGAACTC CATCAGAAGA CATTTAGTGT GTGCAAAGAA AGAATGAGGC
4501 CTGTCAAAGC AGCACTGAAA CAGCTGGATA GACCAGAGAA GGGCCTTTCT
4551 GAAAGGGAGC AGCTGGAACA TACTAGGCAG TGTCTAATCA AAATTGGGGA
4601 TCACATTACA GAATGCCTGA AGGAGTACAC AAATCCCGAG CAAATAAAAC
4651 AGTGGAGGAA AAATTTGTGG ATTTTTGTGT CCAAGTTTAC AGAATTTGAT
4701 GCCAGAAAGC TGCACAAACT CTACAAACAT GCAATCAAAA AGCGCCAAGA
4751 GTCTCAGCAA CACAATGACC AAAACATTAG CAGCAATGTG AATACACATG
4801 TAATCAGAAA TCCAGATGTG GAAAGACTGA AGGAGACTAC AAACCATGAT
4851 GATAGTAGCA GGGACAGTTA TTCTTCTGAT AGACATTTAT CACAATACCA
4901 TGATCATCAC AAAGACAGGC ATCAGGGAGA TGCTTACAAG AAAAGTGACT
4951 CCAGGAAAAG GCCATATTCA GCCTTCAGTA ATGGAAAAGA TCACAGAGAC
5001 TGGGATCACT ACAAACAGGA CAGCAGATAC TACAGTGATA GTAAACATAG
5051 AAAGTTAGAT GACCACAGGA GCAGAGACCA CAGGTCAAAC CTGGAAGGAA
5101 ACTTAAAAGA GAGCCGGGGT CATTCAGATC ACCGCTCCCA TTCAGACCAC
5151 AGGATACACT CAGATCACCG TTCCACTTCA GAATACAGCC ATCATAAATC
5201 TTCGAGAGAT TATAGATACC ACTCAGACTG GCAAATGGAC CACAGAGCTT
5251 CTGGTAGTGG CCCGAGGTCA CCACTAGATC AGAGGTCTCC TTATGGTTCA
5301 AGATCTCCCC TAGGACACAG ATCTCCATTT GAACACTCAT CAGATCACAA
5351 AAGTACACCT GAACATACAT GGAGTAGCCG GAAGACATAA GAAAGACTGA
5401 CATTTTCTGG ACCTTCTTTT TAGCCATATA CAGTAAACTA ACACAGTAAT
5451 TGCCTTACAT GACTTGAAAG ATATGGACTG GATATTCTAT CAGTAGCAGT
5501 ATTGTTACTT CTTTCCAGGA TGCAAGGTCT ATTATCCCAA CAGAAGAAAA
5551 ATATTTTTGT ATTTAAAGTT TATOCTGCAC TCTGCTGCRA ATGTTGTGGC
5601 ACTTTTTTTT TAAGAAATGG AAGATGTTTA CTTTTACAGG GACCTCAACA
5651 CTGCCCCTTT CAGACTGGAT CTTACTATAA AACTCTTCAT GTCAAAGTGG
5701 TTCTAGGCTG AACACAGATT AAATTATGTT TGTAAATGAA CACTTAAACA
5751 CTGACCTGTG CTTATGTTTC AGGAAAGAAT GGGGGATTTA TTTTGTTTTA
5801 TTTCTTGGTA GAGAACTCTC AAGGACTTTG TTCACTTTCC AAAGCTACTT
5851 GTTTACATTG TACACTGCGA CCACCTTGCC GCTTTTCATC ACAAGCTTGA
5901 ATATTTAAAT TCTGTACCTA CAGTTGTAAA ATAGCCAGGA TTTCTCCTGT
5951 TTGTGATCAG TTATAATGCC TTTTTATGAA ACAAACAAAC AAACAAAAAA
6001 CAATTAAAAA AAAAAACACA ACAAAACCAA CAAATGGCTG TAAATTATTG
6051 TAAATTAATT AAATGAGCTT TTTTCCGTCA GGCTTTTTTT GGCTGTTCCT
6101 TTCCCCAACA ACTCAGGCCT TCTTTTCACA AAGTCAGTAT ACTTACATGT
6151 TTTAATAAAA TATCTCGATG GAATCAGAAT GTAAAAATGG GGAAGGGAAT
6201 ATTTTATTCC ATTTAGTGCT CCTTTTTTAT TGGATACTTT TACATACCTG
6251 TTTTTGGTTG TTTTATTTTA TTTTTTTTTT CTATTAAACT GTCAGTGTTG
6301 TGATTGTTGT AATGAACAGT GAGAATATCC CACTCTAAAC TGTGCCCTGG
6351 AAAGCTTTTC AGGTGCATTG GTTTAAAAGA AGGAAGTGTT CTATAGGTGA
6401 ACACTTCAAA ACCCAGATCA GCCAAGATTC ATTGTAAATC CATTTGTTTT

6451 CCCTCTTTAA CATGGGCAAT AATGTCAAAT GTGCTATGCA GCAGTTAATA 

6501 TTTTAGAAGA TTTGAATGAC TTTATTAACA GAATTGTTAC AATGCACACT 

6551 GATTGTACAT AGATAACTTC TATCTGACAA ATTAAATTAA CTAAAACCAA 

6601 AAAAAACC
Figure 6.
Figure 7.



DEIVSVKHLHKKIKTE
CHD-1A 1 GATGAGATTGTTTCAGTGAAACATCTACATAAAAAAATAAAAACAGAAA
CHD-W 1 GATG<X^TTGTTTCAGTGAAACATCCACATAAAAAAATAAAAGCAGAAA
DGIVSVKHPHKKIKAE

KENEEKPEPDIGIKKEA
CHD-1A 51 AAACIkAAATGAAGAAAAGCCTGAGCCAGATACTC^
CHD-W 51 AAAGAAAATGAAGAAAAAGATGAGCCAGAGATTGGTATAAAGAAGGAAGCT
KENEEKDEPEIGIKKEA

EEKRETKEKENKRELKR
CHD-1A 101 GAAGAAAAAAGAGAGACAAAAGAGAAGGAAAATAAAAGGGAATTGAAAAGG
CHD-W 101 GGAGAAAAAAGAGAGACAAAAGAAAAGGAAAATAAGA
GEKRETKEKENK

EKKEKEDKKELKEKDNK
CHD-1A 151 GAGAAAAAAGAAAAAGAGGATAAGAAAGAATTAAAAGAAAAAGATAATAAA

EKRENKVKESTQKEKEV
CHD-1A 201 GAAAAGAGAGAAAACAAAGTAAAAGAATCCACACAGAAAGAAAAAGAAGTG

KEEK
CHD-1A 251 AAGGAAGAGAAG
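
The one-letter amino acid rows in the alignment above can be checked against their nucleotide rows by translating codon by codon. The following is a minimal sketch, not part of the original disclosure. It assumes the standard genetic code, and the names `translate` and `CHD_1A_ROW1` are hypothetical, introduced only for illustration. The input string is the first CHD-1A row as printed above; a trailing partial codon is ignored and any codon containing an unreadable character is reported as `X`.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (assumption:
# the avian nuclear sequences shown here use the standard code).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate a DNA string in the given frame (0, 1 or 2).

    Codons that are not in the table (for example, ones containing an
    ambiguity code or an illegible character) are reported as 'X'.
    A trailing partial codon is ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First CHD-1A row of the alignment, copied as printed.
CHD_1A_ROW1 = "GATGAGATTGTTTCAGTGAAACATCTACATAAAAAAATAAAAACAGAAA"

print(translate(CHD_1A_ROW1))  # DEIVSVKHLHKKIKTE
```

The printed result agrees with the amino acid row shown above the first CHD-1A nucleotide row.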
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Figure 8. 



ATTTATCGGC TAGTCACAAA AGGATCAGTA GAAGAAGATA TTCTTGAAAG AGCCAAGAAA AAGATGGTGT TAGATCATIT 
10 20 30 40 50 60 70 80 

AGTGAITCAG AGAATQGACA CCACAOQGAA AACTGTACTA CATACAGOCT CTACTCCTCC AAGCTCAACA CCTTTTAATA 
90 100 110 120 130 140 150 160

AGGAAGAGTT ATCAGCAATT TTGAAGTTTG GTQCTGAGGA ACTTTTTAAA GAACCTGAAH NNGAAGAAGA GGAOCCTCAG 

170 180 190 200 210 220 230 240

GAGATGGAIA TAGATGAAAT CCTGAAGAGG NCTGAAACTC GAGAAAATGA GTCAOOCCCA TTAACTGTAG GAGATGAGTT 

250 260 270 280 290 300 310 320

ACTTTCACAG TTCAACGTAG CTAACTITIC CAATATGGAT GAAGATGACA TTGAATTOGA AGCAGAACAA AATCTAAGAA 

330 340 350 360 370 380 390 400 

ACTGGGAAGA AATCATTCCA GAAGTTCAGT GGCGAOGAAT AGAGQQGNN3 GAAAGACAAA AAGAACTTGA AGAAATAIAT 

410 420 430 440 450 460 470 480 

ATGCTTCCAA GAATGAGAAA CTGTGCAAAA CAGATCAGCT TTAATOGAAA TGAAGGGAGA TGCAGTAGGA GCAGAAGAXA 

490 500 510 520 530 540 550 560 

TTCTQGATCT GATAGTGATT CCATCTCAGA AAGAAAACGA CCAAAAAAAC GTQGACGACC ACGAACTA3T CCOCGTGAAA 

570 580 590 600 610 620 630 640 

ACATTAAAGG ATTTAGTGAT GCAGAGATTA GACGATTTAT CAAGAGTTAC AAGAAATTTG GTGGCCCAGT TGAAAGGTTA 

650 660 670 680 690 700 710 720 

GATGCTAXAG CTAGAGATOC TGAGCTAGTT GATAAA3CTG AAACAGACCT TAGACGTCTG QGAGAACTTG XACATAATQG 

730 740 750 760 770 780 790 800 

ATGCATTAAG GCTTTAAATG ATAATGACTT TGGTCAAGGA AGAACAGGTG GTAGATTTGG GAAAGTTAAA GGCCCAACAT 

810 820 830 840 850 860 870 880 

TOCGAAXAGC AGGAGTGCAG GTGAATGCAA AGCTAGTCAT TTCTCACGAA GAAGAGTTQG CACCATTOCA TAAA3CGATT 

890 900 910 920 930 940 950 960

CCTTCAGAIC CAGAAGAAAG GAAAAGAXAT CTCATCCCAT ACCACACCAA AGCAGCTCAI 1TTGATATAG ATTGGGGZAA 

970 980 990 1000 1010 1020 1030 1040 

AGAAGATGAT TOCAATCTGT TAAXAGGCAT CTATGAATAT GGTTATGGCA GTTGOGAAAT GAXAAAAASG GA1CCTGA3C 
1050 1060 1070 1080 1090 1100 1110 1120 

TCAGTTTGAC ACAGAAGATT TTACCTGATG ATCCAGAIAA GAAACCCCAG GCTAAGCAG? TACAGACTCG TGCAGATTAC 
1130 1140 1150 1160 1170 1180 1190 1200 

CTCATTAAAT TACTGAAIAA AGACCOTOCA AGAAAGGAAG CACAGAGACT TOCTQGTOCA GOCAATTCAA AGAGGAGAAA 
1210 1220 1230 1240 1250 1260 1270 1280 

AACAAGAAGT AAGAAGAATA AAGCAACAAA GGCTGC 
1290 1300 1310
Figure 9.



C CHD-1A
M CHD-1



DARRYI/3QCLOTL*RIASQlHTrENIRSTFKGI 

F AIOTVTQREPQETRECRKT IfB ILIFEBIC I BIBLLLIGOFCf INFLIF7KNGS5DKB 



******** 



C CBD-lA SVRNSSGESSRSDDDSAGSASGSGSGSSSGSSS^ 

if CHD-J SVW*£GE£SQSGDD-<XKAS(^GS<KSSGS^ 

C CHD- JA TSREKKQVQAKPPKADGSE FWKSSPS I1AVQRSAVLKK0QQQ QKAASSDSGSEEDSS 

/f CHD-J TSRZKX-VQA]0>P1WIX^FWKSSPSII^ 



C CHD-1A
M CHD-1



SSEDSADDSSSETIOOTCHKISDWQHSGSGSVSGTCSD^ 
SSEDS-DDSSSGWOOTHNDBDWQHSGSCTSQL^ 



C CHD-1A
M CHD-1



KV3CSPKPTSRIKPKSOQCSTOQKXPQII3SSKEFK DDDE DYDKRGSRRQA3VNVS YKEAEK 
KVRSRJ^^KSFMIKXILGQKKRQIDSSEDEDDEDy^ 



C CHD-1A
M CHD-1



TKTDSDDLIZVCOTDVPQTESDEFZTIBEF^ 
KKTDSDDLLBVCGEDVPQPEDEEFCTIEIWKDCRVGH]^^ 



C CHD-1A
M CHD-1



HUMAN
C CHD-1A
M CHD-1



KTXEPpE IQYLDCHKGWSB IBNIVgTEETLKQQK ^RGKKKLDNYKKKDQETKRWLKKXS 
IOYLIKWKGWSHIHNTHKTEETLKQQN ^KGMNKLDHYKKKD^CTKRWLKNAS 
ERKKZPtolQyLIKHKGWSHIHNTHETKETI^QQN H^GMKKIiWYraWDQKTraiHLKKAS 



PEDVEYYNCQQELTDDIjHKQYQIVERTNXSFOSKSAAGYP 

PEDVEYYICQQELTDDLHKQYQIVERI IAHSNQKSAAGTPD YYCK»QGLP1SECSWEDG\ 
PEDVEYYNCQQELTDDLHKQYQIVERI IAHSNQKSAAGLPOYYCKWQGLPTSECSWE DGA 



C CHD-1A
M CHD-1



C CHD-1A
M CHD-1



LIAKKTQARIDETFSTttJQSKTTPFXDCKVW 

LISKXFQTC IDEYF SIO^m'PE WXTKVIAQRPREVALKKQPSIIOGflE JLELRDTQLN 



CTJWIAHSWCKGNSCILADEMtarfOTIQa^ 
GIWWLABSWOKCaJSCIIADEJlCTX^I^ 



C CHD-1A



EIQTKASQMNAWYIX3>INSRKHIRTHZWH^^ 



C CHD-1A
M CHD-1



HAF IGVDEAHRLKNDDSIXYRTLI DFTSNHRLLI TGTPLQNSLKELWSLLfiF IMPEXFSS 



C CHD-1A



VreDFEEEBatGREYGYASIJflrci^PFnjtfCVratfWEX^ 

WEDFEEE BGXGRE YGYA5 LHI^IZPF LLRRVKKTATSRSIJAKVEQI IJIKEMSA 



C CHD-1A
M CHD-1



WILTRKYKA1£KGSKGSTSGFLNIMHELKKCCNHCY1»IKPPDDNEFYNK(^ALQHLIR^ 
HI LTWffKALSKGSKGS TSG7LN I MME LKKCCNBC YL I KPPDNHEP YNKQEALQHLI RSS 



C CHD-1A
M CHD-1



GKLI LLDKLL I RLRERCNR VL I FSQKVHKID ILAE YLK YKQFPFQRIDGS IKGEIilKQAL 
G^ILIJ>KIAIRIJlER(TOVLIFSQHVRMIi>I^^
C CHD-1A
M CHD-1



C CHD-W
C CHD-1A
M CHD-1



C CHD-W
C CHD-1A
M CHD-1



DHFNAEGSEDFCriXSTRA3GLGIKLASAmVVI FDSDW^NDLQAQARAHRlGtf KXQV 
DHFTttEGSEDFCFIJ^TRAXLGINIASAD^ 



-IYRLVTKGSVEEDILERAKKXMVLDHLVI* 
HI YRLVTXGSVEK D HJEKAKKKHVLDHLVT* 
NIYRLVTXCSVEEDILERAKKKMVLDBLVK 



:Q IMDTTGXWLBTGSTPSSSTPFNKEELSA 
^^VTOGSV^KDIIJCRAKKKMVLDHLVIC ^MDTTGKTVLBTGSAPSSSTPFNKKgLSA 

£ WDTPGKlVLHTGSTPSSSTPmKEELSA 



ILKFGAEELFKEPEXEEEEPQEMDIDE IlWUCETT^NESG^LTVGDBLLSOrKVANTSNM 
tLKreAEE LFyEP EGZgQEFQBMDIPE ILKRAETHBNEPGPLSV<3SLLSQF1CVANPSNM 
ILKrGAEELTKEPEGEEQEPQEMDIDE IIJCRAETR25NEPGPLTVGZ>EIXSQfKVANrSNM 



C CHD-W
C CHD-1A



C CHD-W
C CHD-1A
M CHD-1



C CHD-W
C CHD-1A
M CHD-1



C CHD-W
C CHD-1A
M CHD-1



C CHD-W
C CHD-1A
M CHD-1



C CHD-W
C CHD-1A
M CHD-1



C CHD-1A
M CHD-1



C CHD-1A
M CHD-1



C CHD-1A
M CHD-1



C CHD-1A
M CHD-1



DEDDIXLEPEQHLRNWEB IIPEVCWRRIEGXEROKBIZEI JISTNGNEO 
DEODIZLEPEHN5KNHEZ I IPB£QRRRLEKKERQKZLES ] 
DEDDIEISPERNSRMHEE IIPESQRRRIEXEERQKELEZ 1 



LEEIYWLPRHRNCAF JISfNSSEG 



KIYHLPRKRNCAK JISFNGSEG 



RCSRSRRYSGSDSDSISEKKRPKKRGRPRTEPRENIKGrSDAB IRRF IKSYXKFGGPVER 
RRSRSRRYSGSDSDSISEJlKWIOTCTPRTIPRENlKGrsnAE IRRF IXSYKXFGGPIiR 
RRSRSRRYSGSDSDSITERKRPKKRGRPRTIPREmKGFSI^ 



LDAIARDAELVDKSBTOLRRLGELVHNGC IKAI2*Dlfl}FGQGRTGGRJGJ^GPTFRX&GV 
LDAIARDA&LVDKSETOLJtfUiGELVro 

IJDAVARnAELVDKSKTDLRRLGELVHNOC IKALKDNSSQQERAGGRLGKVKGPTFRI SGV 
A**.**************************** >#v * 

QVNAKLTOSHEEEIAPiaKSIPSnPEERiaiYVI^ 

QVNAKLVIAHEDEI^LHKSIPSDPEER^YTIPCHTKA^ 

QVHMXVISHEKLAPLHKSIPSDPZEIWRYVira 



YGYGSW5KIKM>PDI^LTC£IU , IOTDFTOQAK^ 
YGYGSWEMIKHDPPI£LTQXIIJ>DCH>DKKPO^ 



AGNSKRRKTRSKKHRATXJUt 

AGGSKIWKTFAKKSKVJKSIKVlOTIKSl^SSPLPSEKSDr^DD- KLNDSKFE5KDRS 

ASNSJOWKTRKKKNK-MFASKIKEEIKSDSSP^ 



KKSVVSDAFVHITASCZIWIAEESEEIJDCPTrSICKEBMRFV^ 
KKIl>IIJOTVBItATSEPWISEZSEBXBaCTrSVCKEKK^ 




KKK JESQQNSDQN-SWATraVIRNPDMERLKE NINHDDSSRDS YSSDRHLSQYHDHHKD 
KKR }ESQQHNDQNI SSNVNTOVIRNP DVE RLKE TTNH DDSSRDS YSSDRHLSQ YHDH HKD 



**• ***** .*** *. 



RBQGDSYXKSDSRJOTYSSFSttTOBREira 

KBQGTAYKX5DSKXKP YSAFS^GXDHKDWtS YKQDSRYYSDS -KHRKLDCSRSRDHRSNL 



C CHD-1A
M CHD-1



C CHD-1A
M CHD-1



EQGUCD-RCHSDHRSHSDHRMHSPHRSTPSTH I INPPRDYRYLSDWOLDHRAASSGPRSP 




LDQRSPYGSRSP 



TEHSAEHRSlTEHTHSSRXTXQKLMSLSSGTLrXP 



U5QRSPYGSRSPU^R5PFXHSSDHKSTPEBTWSSRXTX^U*T7SCIPSrXPYTVNXBSNC 



C CBD-1A LTXLERYCIi>II^AVUJJ£RHQGlXS^^ 

C CHD-1A VQCP&CVTVIXXrnTl&VKV^ 

C CBD-1A KS<Xn£SLSK^LBCTIiaTCW r SSOAXirKrCTYSCK 

C CBD-1A OKTIXOTnTXPm«LmHXMSFrPSGr^^ 

C CSD-J* SECJQCTCRIOTLVUjmUaiCFVLFYTIFrrc^ 

C CHD-1A ArQVBWITOUWCSIGZHFKTQISQDSIJtlHIJSIJ^ 

C CflD-lA LLQCTLIVHRXIiSDKLWXLKPKKT
Figure 10.
Figure 11.
Figure 13.
Figure 14.
Figure 16.